NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other
types of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1786 and 3573-5358. The polypeptide sequences are
designated SEQ ID NO: 2n (wherein n = 1 to 20). The nucleic acids and polypeptides are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenosine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases. In the amino acids provided in
the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1786 and 3573-5358 under stringent hybridization
conditions; nucleic acid sequences which are allelic variants or species homologues of any of the
nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a
specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1786 and 3573-5358. Also
included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an
identifying sequence of SEQ ID NO: 1-1786 and 3573-5358, or a degenerate variant or fragment
thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The sequence information
can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-1786 and 3573-5358.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.
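
By way of illustration only, the distinction between full-match and mismatch detection can be sketched in software. The sketch below is not part of the disclosed array; it compares a probe segment against a target sequence as plain text on the same strand, and does not model hybridization. The function names and the example sequences are hypothetical.

```python
def mismatches(probe, window):
    """Count positions at which two equal-length sequences differ."""
    return sum(1 for p, w in zip(probe, window) if p != w)


def scan_target(probe, target, max_mismatches=1):
    """Slide the probe along the target and report (offset, mismatch count)
    for every window with at most max_mismatches differences."""
    probe, target = probe.upper(), target.upper()
    hits = []
    for offset in range(len(target) - len(probe) + 1):
        count = mismatches(probe, target[offset:offset + len(probe)])
        if count <= max_mismatches:
            hits.append((offset, count))
    return hits


# Hypothetical four-base segment scanned against a hypothetical target.
hits = scan_target("ACGT", "TTACGTAAACCT")
full_matches = [offset for offset, count in hits if count == 0]
single_mismatches = [offset for offset, count in hits if count == 1]
```

Separating the two lists mirrors the choice described above between an array designed for full-match detection and one designed for mismatch detection.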

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic
acid sequences of SEQ ID NO: 1-1786 and 3573-5358 or novel segments or parts of the nucleic
acids provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1786 and
3573-5358; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-1786 and 3573-5358; and a polynucleotide comprising any of the nucleotide sequences of the
mature protein coding sequences of SEQ ID NO: 1-1786 and 3573-5358. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set
forth in SEQ ID NO: 1-1786 and 3573-5358; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g., orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with biological
activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in
SEQ ID NO: 1-1786 and 3573-5358; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.



WO 01/53312 PCT/US00/34263 
The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention 
comprising growing a culture of the host cells of the invention in a suitable culture medium 
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide 
from the culture or from the host cells. Preferred embodiments include those in which the 
protein produced by such process is a mature form of the protein. 

Polynucleotides according to the invention have numerous applications in a variety of 
techniques known to those skilled in the art of molecular biology. These techniques include use 
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene 
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA 
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is 
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used 
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample 
using, e.g., in situ hybridization. 

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of conventional 
procedures and methods that are currently applied to other proteins. For example, a polypeptide 
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such 
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the 
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight 
markers, and as a food supplement. 

Methods are also provided for preventing, treating, or ameliorating a medical condition 
which comprises the step of administering to a mammalian subject a therapeutically effective 
amount of a composition comprising a polypeptide of the present invention and a 
pharmaceutically acceptable carrier. 

In particular, the polypeptides and polynucleotides of the invention can be utilized, for 
example, in methods for the prevention and/or treatment of disorders involving aberrant protein 
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal 
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and 
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as 
recited above. 

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected, the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also 
useful for the same functions known to one of skill in the art as the polypeptides and 
polynucleotides to which they have homology (set forth in Table 2); for which they have a 
signature region (as set forth in Table 3); or for which they have homology to a gene family (as 
set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and 
polynucleotides of the present invention are useful for a variety of applications, as described 
herein, including use in arrays for detection. 

4. DETAILED DESCRIPTION OF THE INVENTION 
4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a,"
"an," and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic 
and/or immunologic activities of any naturally occurring polypeptide. According to the 
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide 
having structural, regulatory or biochemical functions of a naturally occurring molecule. 
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in 
appropriate animals or cells and to bind with specific antibodies. 

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or 
enzymatic molecules as part of a normal or disease process. 

The terms "complementary" or "complementarity" refer to the natural binding of 
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that 
total complementarity exists between the single stranded molecules. The degree of 
complementarity between the nucleic acid strands has significant effects on the efficiency and 
strength of the hybridization between the nucleic acid strands. 
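
The notions of "complete" and "partial" complementarity can be illustrated with a short sketch. It assumes standard Watson-Crick pairing for DNA and reads the three-base example in the definition above as 5'-AGT-3' pairing with 3'-TCA-5' (that is, 5'-ACT-3'). The function names are hypothetical, and the fraction computed is only a simple positional count, not a measure of hybridization strength.

```python
# Watson-Crick pairing for DNA bases.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def complementarity(strand_a, strand_b):
    """Fraction of positions that pair when two equal-length strands,
    both written 5'->3', are aligned antiparallel."""
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have equal length")
    b_reversed = strand_b.upper()[::-1]  # read strand_b 3'->5'
    paired = sum(PAIRS[a] == b for a, b in zip(strand_a.upper(), b_reversed))
    return paired / len(strand_a)


complete = complementarity("AGT", "ACT")  # every position pairs
partial = complementarity("AGT", "ACA")   # only some positions pair
```

A value of 1.0 corresponds to complete complementarity in the sense of the definition; any value between 0 and 1 corresponds to partial complementarity.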

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many 
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line 
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady 
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that 
comprise the adult specialized organs, but are able to regenerate themselves. 

The term "expression modulating fragment," EMF, means a series of nucleotides which 
modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked 
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs 
include, but are not limited to, promoters, and promoter modulating sequences (inducible 
elements). One class of EMFs are nucleic acid fragments which induce the expression of an 
operably linked ORF in response to a specific regulatory factor or physiological event. 

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.
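
The symbol conventions in this definition (T replaced by U for RNA, and N standing for any of A, C, G or T) can be sketched as follows. This is an illustrative reading only; the function names are hypothetical and the short input strings are made-up examples, not sequences from the Sequence Listing.

```python
from itertools import product

BASES = "ACGT"


def to_rna(dna):
    """Write a DNA sequence as RNA by substituting U for T."""
    return dna.upper().replace("T", "U")


def expand_ambiguous(seq):
    """List every concrete sequence obtained by replacing each N
    with one of the four bases A, C, G or T."""
    choices = [BASES if base == "N" else base for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


rna = to_rna("ATGN")                # N is left as an unspecified base
variants = expand_ambiguous("AN")   # one N gives four concrete sequences
```

Because each N multiplies the number of concrete sequences by four, full expansion is only practical for sequences containing a small number of N positions.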

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are 
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as 
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their 
entirety. 

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The
sequence information can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO: 1-1786
and 3573-5358. One such segment can be a twenty-mer nucleic acid sequence because the
probability that a twenty-mer is fully matched in the human genome is 1 in 300. In the human
genome, there are three billion base pairs in one set of chromosomes. Because 4^20 possible
twenty-mers exist, there are 300 times more twenty-mers than there are base pairs in a set of
human chromosomes. Using the same analysis, the probability for a seventeen-mer to be fully
matched in the human genome is approximately 1 in 5. When these segments are used in arrays
for expression studies, fifteen-mer segments can be used. The probability that the fifteen-mer is
fully matched in the expressed sequences is also approximately one in five because expressed
sequences comprise less than approximately 5% of the entire genome sequence.
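
The arithmetic behind these estimates can be sketched as follows. This is an illustrative calculation only, assuming a uniform random-sequence model in which all four bases are equally likely at every position. The function name is hypothetical, and the constants are the round figures given in the text (three billion base pairs per chromosome set, and expressed sequence taken at the 5% upper bound).

```python
GENOME_BP = 3_000_000_000    # base pairs in one chromosome set, per the text
EXPRESSED_FRACTION = 0.05    # upper bound on expressed sequence, per the text


def kmers_per_position(k, target_bp):
    """Ratio of distinct k-mers (4**k) to positions in the target.

    Under the uniform random-sequence assumption, a given k-mer is
    expected to occur by chance about once in this many targets.
    """
    return 4 ** k / target_bp


ratio_20mer_genome = kmers_per_position(20, GENOME_BP)
ratio_17mer_genome = kmers_per_position(17, GENOME_BP)
ratio_15mer_expressed = kmers_per_position(15, GENOME_BP * EXPRESSED_FRACTION)
```

Under these assumptions the ratio is a few hundred for a twenty-mer against the genome and in the single digits for the seventeen-mer and fifteen-mer cases, which is the same order as the rounded values quoted in the text.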

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five-mer. The probability that the twenty-five-mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1/4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen-mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
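
The single-mismatch calculation can be sketched in the same way. The sketch assumes a uniform random-sequence model and reads the full-match probability for a k-mer as 1/4^k, with 3 x k possible single-base substitutions. Scaling the result by the number of positions in the target is an added assumption carried over from the preceding full-match analysis. The function name and constants are hypothetical.

```python
GENOME_BP = 3_000_000_000         # base pairs in one chromosome set, per the text
EXPRESSED_BP = 0.05 * GENOME_BP   # expressed sequence at the 5% upper bound


def single_mismatch_expectation(k, target_bp):
    """Expected chance occurrences of a k-mer with exactly one
    substituted base in a random target of target_bp positions."""
    full_match = 1 / 4 ** k       # probability of a full match at one position
    mismatch_variants = 3 * k     # 3 alternative bases at each of k positions
    return full_match * mismatch_variants * target_bp


twenty_mer_genome = single_mismatch_expectation(20, GENOME_BP)
eighteen_mer_expressed = single_mismatch_expectation(18, EXPRESSED_BP)
twenty_five_mer_genome = single_mismatch_expectation(25, GENOME_BP)
```

Under these assumptions the twenty-mer and eighteen-mer cases both come out between roughly one in ten and one in five, the same order as the approximate figures in the text, while a twenty-five-mer with a single mismatch is far less likely to occur by chance.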

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein. 
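
As an illustration of this definition, the sketch below finds the longest run of triplets that contains no termination codon in a chosen reading frame. It assumes the standard termination codons TAA, TAG and TGA and, following the wording of the definition, does not require an initiating ATG. The function names and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard termination codons (DNA form)


def codons(seq, frame=0):
    """Split a sequence into complete triplets starting at the given frame."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def longest_orf(seq, frame=0):
    """Return the longest stretch of triplets without a termination codon."""
    best, current = [], []
    for codon in codons(seq, frame):
        if codon in STOP_CODONS:
            current = []                 # a stop codon ends the current run
        else:
            current.append(codon)
            if len(current) > len(best):
                best = list(current)
    return "".join(best)


# Hypothetical sequence with two stop-free stretches separated by TAA.
orf = longest_orf("ATGGCCTAAGGGCCCAAATGA")
```

Scanning all three frames of each strand would extend this to a six-frame search; that step is omitted here for brevity.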

The terms "operably linked" or "operably associated" refer to functionally related nucleic 
acid sequences. For example, a promoter is operably associated or operably linked with a coding 
sequence if the promoter controls the transcription of the coding sequence. While operably 
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic 
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence. 

The term "pluripotent" refers to the capability of a cell to differentiate into a number of 
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its 
differentiation capability in comparison to a totipotent cell. 

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide, 
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or 
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more 
preferably at least about 9 amino acids and most preferably at least about 17 or more amino 
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less 
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is 
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient 
length to display biological and/or immunological activity. 

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that 
have not been genetically engineered and specifically contemplates various polypeptides arising 
from post-translational modifications of the polypeptide including, but not limited to, acetylation, 
carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "translated protein coding portion" means a sequence which encodes for the full 
length protein which may include any leader sequence or any processing sequence. 

The term "mature protein coding sequence" means a sequence which encodes a peptide 
or protein without a signal or leader sequence. The "mature protein portion" means that portion 
of the protein which does not include a signal or leader sequence. The peptide may have been 
produced by processing in the cell which removes any leader/signal sequence. The mature 
protein portion may or may not include the initial methionine residue. The methionine residue 
may be removed from the protein during processing in the cell. The peptide may be produced 
synthetically or the protein may have been produced using a polynucleotide only encoding for 
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be 
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon 
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular 
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in 
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of 
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain 
affinities, or degradation/turnover rate. 
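
The use of redundancy in the genetic code to introduce a restriction site without changing the encoded amino acids can be sketched as follows. The sketch assumes the standard genetic code and uses the EcoRI recognition sequence GAATTC as an example site. The function names and the six-base example are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G;
# '*' marks a termination codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate complete triplets of a DNA coding sequence."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def synonymous_codons(codon):
    """List all codons that encode the same amino acid as the given codon."""
    aa = CODON_TABLE[codon.upper()]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


original = "GAGTTC"   # encodes Glu-Phe, no EcoRI site
modified = "GAATTC"   # GAG -> GAA is silent and creates the site GAATTC
is_silent = translate(original) == translate(modified)
```

Here the single change from GAG to GAA leaves the encoded peptide unchanged while creating the example site, which is one instance of the silent changes described above.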

Preferably, amino acid "substitutions" are the result of replacing one amino acid with 
another amino acid having similar structural and/or chemical properties, i.e., conservative amino 
acid replacements. "Conservative" amino acid substitutions may be made on the basis of 
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic 
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include 
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar 
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and 
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and 
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or 
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10 
amino acids. The variation allowed may be experimentally determined by systematically making 
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using 
recombinant DNA techniques and assaying the resulting recombinant variants for activity. 
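
The four groupings listed above can be expressed as a simple lookup for deciding whether a proposed substitution stays within one group. This is only a sketch of the classification given in the text, using standard one-letter amino acid codes. The function names are hypothetical, and membership in the same group is used here as a rough stand-in for "conservative"; it does not predict retained activity, which the text notes is determined experimentally.

```python
# Groupings as listed in the text, in one-letter amino acid codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),       # Ala Leu Ile Val Pro Phe Trp Met
    "polar_neutral": set("GSTCYNQ"),   # Gly Ser Thr Cys Tyr Asn Gln
    "basic": set("RKH"),               # Arg Lys His
    "acidic": set("DE"),               # Asp Glu
}


def residue_class(aa):
    """Return the name of the group containing the given residue."""
    for name, members in RESIDUE_CLASSES.items():
        if aa.upper() in members:
            return name
    raise ValueError("unknown residue: " + aa)


def is_conservative(original, replacement):
    """True when two different residues fall in the same group."""
    same_residue = original.upper() == replacement.upper()
    return not same_residue and residue_class(original) == residue_class(replacement)


leu_to_ile = is_conservative("L", "I")   # both nonpolar
asp_to_lys = is_conservative("D", "K")   # acidic to basic
```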

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide 
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover 
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited 
for expression, scale up and the like in the host cells chosen for expression. For example, 
cysteine residues can be deleted or substituted with another amino acid residue in order to 
eliminate disulfide bridges. 

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological 
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the 
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more 
preferably at least 99% by weight, of the indicated biological macromolecules present (but water, 
buffers, and other small molecules, especially molecules having a molecular weight of less than 
1000 daltons, can be present). 

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from 
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or 
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in 
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or 
polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" 
defines a polypeptide or protein essentially free of native endogenous substances and 
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most 
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus 
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can 
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and (3) 
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal 
methionine residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product. 

The term "recombinant expression system" means host cells which have stably integrated 
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked 
to the DNA segment or synthetic gene to be expressed. This term also means host cells which 
have stably integrated a recombinant genetic element or elements having a regulatory role in 
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells 
can be prokaryotic or eukaryotic. 

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is expressed 
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed. 
"Secreted" proteins also include without limitation proteins that are transported across the 
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include 
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, F.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the 
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization 
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e., 
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are 
described herein in the examples. 

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for 
14-base oligonucleotides), 48°C (for 17-base oligos), 55°C (for 20-base oligonucleotides), and 
60°C (for 23-base oligonucleotides). 
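The oligonucleotide wash temperatures listed above can be captured as a simple lookup. The following is a minimal sketch only; the names `OLIGO_WASH_TEMP_C` and `oligo_wash_temp_c` are hypothetical and are introduced here purely for illustration. Only the four oligonucleotide lengths named in the text are covered, and no interpolation for other lengths is implied.

```python
# Wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, exactly as listed in the text.
# The dictionary and function names are hypothetical (illustration only).
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def oligo_wash_temp_c(length_in_bases):
    """Return the listed wash temperature for an oligonucleotide length.

    Lengths not named in the text raise KeyError; the text gives no rule
    for other lengths, so none is guessed here.
    """
    return OLIGO_WASH_TEMP_C[length_in_bases]
```

For example, `oligo_wash_temp_c(17)` returns the 48°C value given for 17-base oligonucleotides.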

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid 
sequences, for example a mutant sequence, that varies from a reference sequence by one or more 
substitutions, deletions, or additions, the net effect of which does not result in an adverse 
functional dissimilarity between the reference and subject sequences. Typically, such a 
substantially equivalent sequence varies from one of those listed herein by no more than about 
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided 
by the total number of residues in the substantially equivalent sequence is about 0.35 or less). 
Such a sequence is said to have 65% sequence identity to the listed sequence. In one 
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a 
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment, 
by no more than 25% (75% sequence identity); and in a further variation of this embodiment, by 
no more than 20% (80% sequence identity) and in a further variation of this embodiment, by no 
more than 10% (90% sequence identity) and in a further variation of this embodiment, by no 
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed 
amino acid sequence, more preferably at least 90% sequence identity. Substantially equivalent 
nucleotide sequences of the invention can have lower percent sequence identities, taking into 
account, for example, the redundancy or degeneracy of the genetic code. Preferably, a nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, and most
preferably at least about 95% identity. For the purposes of the present invention, sequences
having substantially equivalent biological activity and substantially equivalent expression
characteristics are considered substantially equivalent. For the purposes of determining
equivalence, truncation of the mature sequence (e.g., via a mutation which creates a spurious 
stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun 
Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can 
also be determined by other methods known in the art, e.g. by varying hybridization conditions. 
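The numerical part of the "substantially equivalent" definition above can be illustrated with a short calculation. The following is a minimal sketch, assuming the two sequences have already been aligned (by the Jotun Hein method or otherwise) and are supplied as equal-length strings in which "-" marks a gap. The function names are hypothetical. The sketch checks only the arithmetic threshold (about 0.35 or less); it does not assess functional dissimilarity, biological activity, or expression characteristics, which the definition also requires.

```python
def variation_fraction(reference_aligned, subject_aligned):
    """Fraction of variation as described in the text: substitutions,
    additions, and deletions in the subject sequence relative to the
    reference, divided by the number of residues in the subject sequence.

    Both arguments are rows of a precomputed pairwise alignment, with "-"
    marking a gap (an assumption of this sketch).
    """
    if len(reference_aligned) != len(subject_aligned):
        raise ValueError("aligned sequences must have equal length")
    changes = 0
    subject_residues = 0
    for ref_char, subj_char in zip(reference_aligned, subject_aligned):
        if subj_char != "-":
            subject_residues += 1
        # A mismatch column is a substitution (both residues), an addition
        # (gap in reference), or a deletion (gap in subject).
        if ref_char != subj_char:
            changes += 1
    if subject_residues == 0:
        raise ValueError("subject sequence contains no residues")
    return changes / subject_residues


def meets_variation_threshold(reference_aligned, subject_aligned,
                              max_variation=0.35):
    """True if the subject varies from the reference by no more than
    max_variation (0.35 corresponds to 65% sequence identity)."""
    return variation_fraction(reference_aligned, subject_aligned) <= max_variation
```

For instance, a 10-residue subject sequence with a single substitution relative to the reference gives a variation fraction of 0.1, which corresponds to the 90% sequence identity embodiment.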

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems 
described below. The presence and activity of a UMF can be confirmed by attaching the 
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated 
with an appropriate host under appropriate conditions and the uptake of the marker sequence is 
determined. As described above, a UMF will increase the frequency of uptake of a linked 
marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 

The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358; a polynucleotide encoding any one
of the peptide sequences of SEQ ID NO:1787-3572 and 5359-7144; and a polynucleotide
comprising the nucleotide sequence encoding the mature protein coding sequence of the
polypeptides of any one of SEQ ID NO:1787-3572 and 5359-7144. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID
NO:1-1786 and 3573-5358; (b) nucleotide sequences encoding any one of the amino acid sequences
set forth in the Sequence Listing; (c) a polynucleotide which is an allelic variant of any
polynucleotide recited above; (d) a polynucleotide which encodes a species homolog of any of
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a
specific domain or truncation of the polypeptides of SEQ ID NO:1787-3572 and 5359-7144.
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in 
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic 
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable 
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and 
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 can be obtained
by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions
using any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 or a portion thereof as a
probe. Alternatively, the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 may be used as the
basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate
genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The polynucleotides of the invention also provide polynucleotides including nucleotide 
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides 
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above. 

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO:1-1786 and 3573-5358, or complements thereof, which fragment is greater than
about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and
most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more
that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the
invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO:1-1786
and 3573-5358, a representative fragment thereof, or a nucleotide sequence at least 90% identical,
preferably 95% identical, to SEQ ID NO:1-1786 and 3573-5358 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention includes
nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon for another codon
that encodes the same amino acid is expressly contemplated.
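The codon substitution described above can be illustrated by translating two ORFs and comparing the resulting amino acid sequences. The following is a minimal sketch, assuming the standard genetic code and DNA ORFs written 5' to 3' in upper case with a length that is a multiple of three. The names `CODON_TABLE`, `translate`, and `encode_same_protein` are hypothetical, and the example sequences are invented for illustration; they are not sequences of the Sequence Listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order at each
# position; "*" denotes a stop codon. Names here are hypothetical.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(orf):
    """Translate a DNA ORF (upper case, length a multiple of 3) codon by codon."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a, orf_b):
    """True if the two ORFs encode identical amino acid sequences."""
    return translate(orf_a) == translate(orf_b)
```

For example, `ATGCTGTAA` and `ATGTTATAA` differ at the second codon (CTG versus TTA), yet both codons encode leucine, so the two ORFs encode the same amino acid sequence.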

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO:1-1786 and 3573-5358, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool,
is used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al. J. Mol. Biol. 21:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more 
domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known 
to those of skill in the art and can include, for example, methods for determining hybridization 
conditions that can routinely isolate polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO:1-1786 and 3573-5358, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that direct 
the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. 
Also included are the cDNA inserts of any of the clones identified herein. 

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al. 
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful 
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., 
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the 
art. Accordingly, the invention also provides a vector including a polynucleotide of the 
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a 
selectable marker for the host cell. Vectors according to the invention include expression 
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell 
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism. 

The present invention further provides recombinant constructs comprising a nucleic acid 
having any of the nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 or a fragment
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO:1-1786 and
3573-5358 or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a vector
comprising one of the ORFs of the present invention, the vector may further comprise regulatory 
sequences, including for example, a promoter, operably linked to the ORF. Large numbers of 
suitable vectors and promoters are known to those of skill in the art and are commercially 
available for generating the recombinant constructs of the present invention. The following
vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK,
pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.

Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be 
employed as a matter of choice. 

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection, of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.



4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO:1-1786 and 3573-5358, or fragments, analogs or derivatives thereof.
An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense"
nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic
acid molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic
acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO:1787-3572 and 5359-7144 or antisense nucleic acids complementary to a nucleic
acid sequence of SEQ ID NO:1-1786 and 3573-5358 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers 
to the region of the nucleotide sequence comprising codons which are translated into amino acid 
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a 
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term 
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not 
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions). 

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO:1-1786 and 3573-5358), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule 
can be complementary to the entire coding region of a mRNA, but more preferably is an 
oligonucleotide that is antisense to only a portion of the coding or noncoding region of a mRNA. 
For example, the antisense oligonucleotide can be complementary to the region surrounding the 
translation start site of a mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 
15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention 
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures 
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can 
be chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase the 
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., 
phosphorothioate derivatives and acridine substituted nucleotides can be used. 
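The Watson and Crick design rule mentioned above can be illustrated by taking the reverse complement of a chosen window of the coding strand. The following is a minimal sketch, assuming a DNA coding strand written 5' to 3' and an unmodified DNA antisense oligonucleotide. The function name `antisense_oligo` and its parameters are hypothetical, and the example sequence is invented for illustration. Hoogsteen base pairing and modified nucleotides are not modeled.

```python
# Watson-Crick complements for DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_oligo(coding_strand, start, length):
    """Return a DNA oligonucleotide, written 5' to 3', that is antisense to
    coding_strand[start:start + length].

    coding_strand is assumed to be written 5' to 3'; start is a 0-based
    offset. The name and signature are hypothetical (illustration only).
    """
    if start < 0 or length <= 0 or start + length > len(coding_strand):
        raise ValueError("window lies outside the coding strand")
    target = coding_strand[start:start + length]
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(_COMPLEMENT)[::-1]
```

For example, for the invented coding-strand window `GCCATGGCTAAA`, which surrounds an ATG start codon, `antisense_oligo("GCCATGGCTAAA", 0, 12)` gives `TTTAGCCATGGC`.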

Examples of modified nucleotides that can be used to generate the antisense nucleic acid 
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, 
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, 
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil, 
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, 
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, 
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the 
antisense nucleic acid can be produced biologically using an expression vector into which a 
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the 
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, 
described further in the following subsection). 

The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or 
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the 
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by 
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of 
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in 
the major groove of the double helix. An example of a route of administration of antisense 
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively, 
antisense nucleic acid molecules can be modified to target selected cells and then administered 
systemically. For example, for systemic administration, antisense molecules can be modified 
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., 
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface 
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using 
the vectors described herein. To achieve sufficient intracellular concentrations of antisense 
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the 
control of a strong pol II or pol III promoter are preferred. 

In yet another embodiment, the antisense nucleic acid molecule of the invention is an 
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific 
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the 
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The 
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al. 
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) 
FEBS Lett 215: 327-330). 



4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. 
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a 
single-stranded nucleic acid, such as a mRNA, to which they have a complementary region. 
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haseloff and Gerlach (1988) 
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit 
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the invention can be 
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO:1-1786 
and 3573-5358). For example, a derivative of a Tetrahymena L-19 IVS RNA can be 
constructed in which the nucleotide sequence of the active site is complementary to the 
nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat. 
No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be 
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA 
molecules. See, e.g., Bartel et al., (1993) Science 261:1411-1418. 

Alternatively, gene expression can be inhibited by targeting nucleotide sequences 
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical 
structures that prevent transcription of the gene in target cells. See generally, Helene (1991) 
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and 
Maher (1992) Bioessays 14: 807-15. 

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or 
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic 
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med 
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid 
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a 
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral 
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under 
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using 
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above; 
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675. 

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of 
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. 
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a 
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in 
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or 
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996), 
above). 

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their 
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the 
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug 
delivery known in the art. For example, PNA-DNA chimeras can be generated that may 
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition 
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA 
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked 
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between 
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras 
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24: 
3357-63. For example, a DNA chain can be synthesized on a solid support using standard 
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g., 
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA 
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then 
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3' 
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized 
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem 
Lett 5: 1119-11124. 

In other embodiments, the oligonucleotide may include other appended groups such as 
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the 
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556; 
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or 
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, 
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et 
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res. 
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a 
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered 
cleavage agent, etc. 



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the 
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the 
invention introduced into the host cell using known transformation, transfection or infection 
methods. The present invention still further provides host cells genetically engineered to express 
the polynucleotides of the invention, wherein such polynucleotides are in operative association 
with a regulatory sequence heterologous to the host cell which drives expression of the 
polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous 
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the 
naturally occurring promoter with all or part of a heterologous promoter so that the cells express 
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it 
is operatively linked to the encoding sequences. See, for example, PCT International Publication 
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International 
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter 
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding 
sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower 
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a 
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by 
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis, 
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the 
polynucleotides of the invention can be used in conventional manners to produce the gene 
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a 
heterologous protein under the control of the EMR. 

Any host/vector system can be used to express one or more of the ORFs of the present 
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, 
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis. 
The most preferred cells are those which do not normally express the particular polypeptide or 
protein or which express the polypeptide or protein at a low natural level. Mature proteins can 
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate 
promoters. Cell-free translation systems can also be employed to produce such proteins using 
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and 
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et 
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New 
York (1989), the disclosure of which is hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a 
compatible vector are, for example, C127 cells, monkey COS cells, Chinese Hamster Ovary 
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3 
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived 
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK, 
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of 
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation 
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking 
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example, 
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide 
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced 
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or 
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein 
refolding steps can be used, as necessary, in completing configuration of the mature protein. 
Finally, high performance liquid chromatography (HPLC) can be employed for final purification 
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast 
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include 
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or 
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial 
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial 
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it 
may be necessary to modify the protein produced therein, for example by phosphorylation or 
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent 
attachments may be accomplished using known chemical or enzymatic methods. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene 
may be replaced by homologous recombination. As described herein, gene targeting can be used 
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different 
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such 
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions, 
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or 
combinations of said sequences. Alternatively, sequences which affect the structure or stability 
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by 
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice 
sites, leader sequences for enhancing or modifying transport or secretion properties of the 
protein, or other sequences which alter or improve the function or stability of protein or RNA 
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the 
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or 
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion 
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. 
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific 
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than 
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new 
sequences are added. In all cases, the identification of the targeting event may be facilitated by 
the use of one or more selectable marker genes that are contiguous with the targeting DNA, 
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell 
genome. The identification of the targeting event may also be facilitated by the use of one or 
more marker genes exhibiting the property of negative selection, such that the negatively 
selectable marker is linked to the exogenous DNA, but configured such that the negatively 
selectable marker flanks the targeting sequence, and such that a correct homologous 
recombination event with sequences in the host cell genome does not result in the stable 
integration of the negatively selectable marker. Markers useful for this purpose include the 
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine 
phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. 
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No. 
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference 
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide 
comprising: the amino acid sequences set forth as any one of SEQ ID NO:1787-3572 and 5359-7144 
or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID NO:1-1786 
and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the 
invention also include polypeptides preferably with biological or immunological activity that are 
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID 
NO:1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences 
set forth as SEQ ID NO:1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the 
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions. 
The invention also provides biologically active or immunologically active variants of any of the 
amino acid sequences set forth as SEQ ID NO:1787-3572 and 5359-7144 or the corresponding 
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about 
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least 
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at 
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by 
allelic variants may have a similar, increased, or decreased activity compared to polypeptides 
comprising SEQ ID NO:1787-3572 and 5359-7144. 
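
As a minimal sketch of how a candidate variant might be compared against the identity levels recited above, the following computes percent amino acid identity over two sequences that have already been aligned. The function names, the use of "-" as the gap character, and the choice to count every alignment column in the denominator are assumptions made for this example; alignment programs differ in how they define the denominator, and the qualifier "about" is not modeled.

```python
# Illustrative sketch only; names and the identity convention are assumptions.
THRESHOLDS = (65, 70, 75, 80, 85, 90, 95, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Columns in which both sequences carry a gap are ignored; a residue
    opposite a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("alignment has no residue columns")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float):
    """Return the largest recited identity level met, or None if below 65%."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None
```

For example, MKTAYIAKQR aligned against MKTAYIAKQK differs at one of ten positions, so the sketch reports 90% identity and the 90% level as the highest one met.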

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H. 
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer. 
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such 
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example, 
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding 
sequence is identified in the sequence listing by translation of the disclosed nucleotide 
sequences. The mature form of such protein may be obtained by expression of a full-length 
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form 
of the protein is also determinable from the amino acid sequence of the full-length form. Where 
proteins of the present invention are membrane bound, soluble forms of the proteins are also 
provided. In such forms, part or all of the regions causing the proteins to be membrane bound 
are deleted so that the proteins are fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier. 

The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the 
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
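
The notion of a degenerate variant can be illustrated with a short sketch that translates two DNA fragments using the standard genetic code and checks that they differ in nucleotide sequence yet encode the same polypeptide. The function names and the short example ORFs in the usage note are hypothetical and are not sequences of the invention.

```python
# Illustrative sketch only; names and example ORFs are assumptions.
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an ORF in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the ORFs differ in sequence but encode identical polypeptides."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)
```

For example, ATGGCTAGCTAA and ATGGCCTCATAA both translate to Met-Ala-Ser and are therefore degenerate variants of one another under this definition, whereas ATGGCTACCTAA encodes Met-Ala-Thr and is not.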
A variety of methodologies known in the art can be utilized to obtain any one of the 
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid 
sequence can be synthesized using commercially available peptide synthesizers. The 
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary 
structural and/or conformational characteristics with proteins may possess biological properties 
in common therewith, including protein activity. This technique is particularly useful in 
producing small peptides and fragments of larger polypeptides. Fragments are useful, for 
example, in generating antibodies against the native polypeptide. Thus, they may be employed 
as biologically active or immunological substitutes for natural, purified proteins in screening of 
therapeutic compounds and in immunological processes for the development of antibodies. 

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or 
which the cell normally produces at a lower level. One skilled in the art can readily adapt 
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a 
culture of host cells of the invention in a suitable culture medium, and purifying the protein from 
the cells or the culture in which the cells are grown. For example, the methods of the invention 
include a process for producing a polypeptide in which a host cell containing a suitable 
expression vector that includes a polynucleotide of the invention is cultured under conditions that 
allow expression of the encoded polypeptide. The polypeptide can be recovered from the 
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and 
further purified. Preferred embodiments include those in which the protein produced by such 
process is a full length or mature form of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which 
naturally produce the polypeptide or protein. One skilled in the art can readily follow known 
methods for isolating polypeptides and proteins in order to obtain one of the isolated 
polypeptides or proteins of the present invention. These include, but are not limited to, 
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, 
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and 
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory 
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that 
retain biological/immunological activity include fragments comprising greater than about 100 
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein 
domains. 

The purified polypeptides can be used in in vitro binding assays which are well known in 
the art to identify molecules which bind to the polypeptides. These molecules include, but are not 
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other 
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist 
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the 
molecules are titrated into a plurality of cell cultures or animals and then tested for either 
cell/animal death or prolonged survival of the animal/cells. 

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to 
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the 
specificity of the binding molecule for SEQ ID NO:1787-3572 and 5359-7144. 

The protein of the invention may also be expressed as a product of transgenic animals, 
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized 
by somatic or germ cells containing a nucleotide sequence encoding the protein. 

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or 
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be 
made by those skilled in the art using known techniques. Modifications of interest in the protein 
sequences may include the alteration, substitution, replacement, insertion or deletion of a 
selected amino acid residue in the coding sequence. For example, one or more of the cysteine 
residues may be deleted or replaced with another amino acid to alter the conformation of the 
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are 
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such 
alteration, substitution, replacement, insertion or deletion retains the desired activity of the 
protein. Regions of the protein that are important for the protein function can be determined by 
various methods known in the art including the alanine-scanning method, which involves 
systematic substitution of single or strings of amino acids with alanine, followed by testing the 
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
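
The alanine-scanning substitution described above can be sketched in a few lines. The function name, the one-based position convention, and the example peptide in the usage note are assumptions made for illustration; only the enumeration of variants is shown, and each variant would still have to be expressed and tested for biological activity.

```python
# Illustrative sketch only; names and conventions are assumptions.
def alanine_scan(protein: str, window: int = 1):
    """Yield (position, original_residues, variant) for each alanine substitution.

    `window` is the number of consecutive residues replaced at once, covering
    both single residues and strings of residues. Windows that are already
    all alanine are skipped because the variant would equal the parent.
    Positions are one-based.
    """
    protein = protein.upper()
    for i in range(len(protein) - window + 1):
        original = protein[i:i + window]
        if set(original) == {"A"}:
            continue
        variant = protein[:i] + "A" * window + protein[i + window:]
        yield i + 1, original, variant
```

For a hypothetical four-residue peptide MKAC, single-residue scanning yields the variants AKAC, MAAC and MKAA, with position 3 skipped because it is already alanine.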

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological 
methodologies may also be easily made by those skilled in the art given the disclosures herein. 
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. 
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and 
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by 
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present 
invention is "transformed." 

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose, 
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving 
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen, 
respectively. The protein can also be tagged with an epitope and subsequently purified by using 
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC) 
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other 
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein." 
The polypeptides of the invention include analogs (variants). This embraces fragments, 
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted. 
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or 
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to 
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs 
may exhibit improved properties such as activity and/or stability. Examples of moieties which 
may be fused to the polypeptide or an analog include, for example, targeting moieties which 
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, 
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well 
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be 
fused to the polypeptide include therapeutic agents which are used for treatment, for example, 
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and 
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as 
alpha or beta interferon. 

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest 
match between the sequences tested. Methods to determine identity and similarity are codified 
in computer programs including, but not limited to, the GCG program package, including GAP 
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group, 
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F. 
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res. 
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp. 
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software 
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam 
software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein 
incorporated by reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. 
Biol., 157, pp. 105-31 (1982), incorporated herein by reference). The BLAST programs are 
publicly available from the National Center for Biotechnology Information (NCBI) and other 
sources (BLAST Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., 
et al., J. Mol. Biol. 215:403-410 (1990)). 
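
By way of illustration only, the notion of finding "the largest match between the sequences 
tested" can be sketched with a small global alignment (Needleman-Wunsch) routine followed by a 
percent identity calculation. This sketch is not the GAP or BLAST implementation cited above. 
The scoring values (match +1, mismatch -1, gap -1), the function names, and the choice of the 
alignment length as the denominator are assumptions made for this example; the cited programs 
use their own scoring matrices, gap penalties and identity conventions. 

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings.

    The scoring parameters are illustrative assumptions, not the defaults
    of any program cited in the text.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment columns."""
    x, y = global_align(a, b)
    if not x:
        return 0.0
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * same / len(x)


print(percent_identity("ACGT", "ACGT"))
print(percent_identity("ACGT", "AGT"))
```

Other conventions divide by the length of the shorter sequence or by the number of ungapped 
columns, so identity values are comparable only when the same convention and parameters are 
used throughout. 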

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric 
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to 
another polypeptide. Within a fusion protein the polypeptide according to the invention can 
correspond to all or a portion of a protein according to the invention. In one embodiment, a 
fusion protein comprises at least one biologically active portion of a protein according to the 
invention. In another embodiment, a fusion protein comprises at least two biologically active 
portions of a protein according to the invention. Within the fusion protein, the term "operatively 
linked" is intended to indicate that the polypeptide according to the invention and the other 
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or 
C-terminus. 

For example, in one embodiment a fusion protein comprises a polypeptide according to 
the invention operably linked to the extracellular domain of a second protein. 

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which 
the polypeptide sequences according to the invention, comprising one or more domains, are fused 
to sequences derived from a member of the immunoglobulin protein family. The 
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical 
compositions and administered to a subject to inhibit an interaction between a ligand and a 
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo. 
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand. 
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of 
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or 
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be 
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays 
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand. 

A chimeric or fusion protein of the invention can be produced by standard recombinant 
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences 
are ligated together in-frame in accordance with conventional techniques, e.g., by employing 
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for 
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to 
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can 
be synthesized by conventional techniques including automated DNA synthesizers. 
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that 
give rise to complementary overhangs between two consecutive gene fragments that can 
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for 
example, Ausubel et al. (eds.) CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & 
Sons, 1992). Moreover, many expression vectors are commercially available that already encode 
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the 
invention can be cloned into such an expression vector such that the fusion moiety is linked 
in-frame to the protein of the invention. 
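
As a purely illustrative sketch of what "in-frame" means at the sequence level, the following 
joins two coding sequences so that the reading frame of the downstream partner is preserved, 
then translates the result with the standard genetic code. The function names, the optional 
linker argument and the toy sequences are hypothetical and are not sequences of the invention; 
the check is a design-time sanity test, not a cloning protocol. 

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(dna):
    """Translate a DNA coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def fuse_in_frame(upstream, downstream, linker=""):
    """Join two open reading frames so that both are read in the same frame.

    A terminal stop codon on the upstream partner is removed; any stop
    codon remaining before the downstream partner is treated as an error.
    """
    upstream, downstream, linker = (s.upper() for s in (upstream, downstream, linker))
    for name, seq in (("upstream", upstream), ("downstream", downstream), ("linker", linker)):
        if len(seq) % 3:
            raise ValueError(f"{name} length is not a multiple of three")
    if upstream[-3:] in STOP_CODONS:
        upstream = upstream[:-3]
    if "*" in translate(upstream + linker):
        raise ValueError("stop codon upstream of the fusion junction")
    return upstream + linker + downstream


# Hypothetical toy sequences: fusion moiety first, polypeptide of interest second.
fused = fuse_in_frame("ATGGCTTAA", "ATGAAATGA")
print(fused, translate(fused))
```

Placing the fusion moiety upstream corresponds to the GST embodiment above, in which the 
polypeptide sequences are fused to the C-terminus of the GST sequences. 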

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal 
function of the encoded protein. The invention thus provides gene therapy to restore normal 
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of 
the invention. Delivery of a functional gene encoding polypeptides of the invention to 
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly 
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of 
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example, 
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of 
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific 
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of 
the nucleotides of the present invention or a gene encoding the polypeptides of the present 
invention can also be accomplished with extrachromosomal substrates (transient expression) or 
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 
Alternatively, it is contemplated that in other human disease states, preventing the expression of 
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease 
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively 
regulate the expression of polypeptides of the invention. 

Other methods of inhibiting expression of a protein include the introduction of antisense 
molecules to the nucleic acids of the present invention, their complements, or their translated RNA 
sequences, by methods known in the art. Further, the polypeptides of the present invention can be 
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such 
as a silencer, which is tissue specific. 
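
For illustration only, an antisense molecule is the reverse complement of its target, so that 
it can base-pair with the sense strand or transcript. The following minimal sketch derives an 
antisense DNA oligonucleotide from a target written 5' to 3'. The function name and the example 
sequence are hypothetical, and no account is taken of oligonucleotide length, chemistry or 
target accessibility, all of which matter in practice. 

```python
# Map each base to its Watson-Crick partner (U in an RNA target pairs with A).
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_dna(target):
    """Return the antisense DNA sequence (5'->3') for a DNA or RNA target (5'->3')."""
    target = target.upper()
    invalid = set(target) - set("ACGTU")
    if invalid:
        raise ValueError(f"unexpected characters in target: {sorted(invalid)}")
    # Complement every base, then reverse so the result also reads 5'->3'.
    return target.translate(_COMPLEMENT)[::-1]


print(antisense_dna("AUGGCC"))
```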

The present invention still further provides cells genetically engineered in vivo to express the 
polynucleotides of the invention, wherein such polynucleotides are in operative association with a 
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in 
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of 
the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of cells to 
permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by 
homologous recombination) to provide increased polypeptide expression by replacing, in whole or 
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells 
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is 
operatively linked to the desired protein encoding sequences. See, for example, PCT International 
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT 
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous 
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired 
protein coding sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may 
be replaced by homologous recombination. As described herein, gene targeting can be used to 
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene 
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory 
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative 
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations 
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or 
protein produced may be replaced, removed, added, or otherwise modified by targeting. These 
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences 
for enhancing or modifying transport or secretion properties of the protein, or other sequences 
which alter or improve the function or stability of protein or RNA molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the gene 
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both 
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory 
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the 
targeting event may replace an existing element; for example, a tissue-specific enhancer can be 
replaced by an enhancer that has broader or different cell-type specificity than the naturally 
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are 
added. In all cases, the identification of the targeting event may be facilitated by the use of one or 
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection 
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the 
targeting event may also be facilitated by the use of one or more marker genes exhibiting the 
property of negative selection, such that the negatively selectable marker is linked to the exogenous 
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and 
such that a correct homologous recombination event with sequences in the host cell genome does 
not result in the stable integration of the negatively selectable marker. Markers useful for this 
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial 
xanthine-guanine phosphoribosyl-transferase (gpt) gene. 
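
The logic of combining a selectable marker with a negatively selectable marker can be 
summarized in a short sketch. The model below is a deliberate simplification: it assumes that 
a random (non-homologous) integration carries the flanking negative marker into the genome, 
whereas a correct homologous recombination event retains only the marker contiguous with the 
targeting DNA. The class, function and outcome names are hypothetical and serve only to make 
the selection scheme explicit. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    has_positive_marker: bool  # selectable marker contiguous with the targeting DNA
    has_negative_marker: bool  # negatively selectable marker flanking the targeting sequence


def survives(cell, positive_selection=True, negative_selection=True):
    """A cell survives if it carries the positive marker and lacks the negative one."""
    if positive_selection and not cell.has_positive_marker:
        return False
    if negative_selection and cell.has_negative_marker:
        return False
    return True


# Simplified outcomes after introducing the targeting construct (assumed model).
OUTCOMES = {
    "no integration": Cell(False, False),
    "random integration": Cell(True, True),
    "homologous recombination": Cell(True, False),
}

survivors = [name for name, cell in OUTCOMES.items() if survives(cell)]
print(survivors)
```

Under these assumptions only correctly targeted cells pass both selections; in practice 
surviving clones would still be confirmed by molecular analysis. 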

The gene targeting or gene activation techniques which can be used in accordance with this 
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel; 
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627 
(WO 93/09222) by Selden et al.; and International Application No. PCT/US90/06436 
(WO 91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety. 

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the 
invention in vivo, one or more genes provided by the invention are either over expressed or 
inactivated in the germ line of animals using homologous recombination [Capecchi, Science 
244:1288-1292 (1989)]. Animals in which the gene is over expressed, under the regulatory 
control of exogenous or endogenous promoter elements, are known as transgenic animals. 
Animals in which an endogenous gene has been inactivated by homologous recombination are 
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be 
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic 
animals are useful to determine the roles polypeptides of the invention play in biological 
processes, and preferably in disease states. Transgenic animals are useful as model systems to 
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human 
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT 
Publication No. WO 94/28122, incorporated herein by reference. 

Transgenic animals can be prepared wherein all or part of a promoter of the 
polynucleotides of the invention is either activated or inactivated to alter the level of expression 
of the polypeptides of the invention. Inactivation can be carried out using homologous 
recombination methods described above. Activation can be achieved by supplementing or even 
replacing the homologous promoter to provide for increased protein expression. The homologous 
promoter can be supplemented by insertion of one or more heterologous enhancer elements 
known to confer promoter activation in a particular tissue. 

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knock out strategies, of animals that fail to express 
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as 
models for studying the in vivo activities of polypeptide as well as for studying modulators of the 
polypeptides of the invention. 

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or 
more of the uses or biological activities (including those associated with assays cited herein) 
identified herein. Uses or activities described for proteins of the present invention may be 
provided by administration or use of such proteins or of polynucleotides encoding such proteins 
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The 
mechanism underlying the particular condition or pathology will dictate whether the 
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or 
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic 
compositions of the invention" include compositions comprising isolated polynucleotides 
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or 
polypeptides of the invention (including full length protein, mature protein and truncations or 
domains thereof), or compounds and other substances that modulate the overall activity of the 
target gene products, either at the level of target gene/protein expression or target protein 
activity. Such modulators include polypeptides, analogs (variants), including fragments and 
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or 
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening 
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple 
helix formation; and in particular antibodies or other binding partners that specifically recognize 
one or more epitopes of the polypeptides of the invention. 

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research 
community for various purposes. The polynucleotides can be used to express recombinant 
protein for analysis, characterization or therapeutic use; as markers for tissues in which the 
corresponding protein is preferentially expressed (either constitutively or at a particular stage of 
tissue differentiation or development or in disease states); as molecular weight markers on gels; 
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene 
positions; to compare with endogenous DNA sequences in patients to identify potential genetic 
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of 
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known 
sequences in the process of discovering other novel polynucleotides; for selecting and making 
oligomers for attachment to a "gene chip" or other support, including for examination of 
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as 
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the 
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for 
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap 
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify 
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of 
the binding interaction. 

The polypeptides provided by the present invention can similarly be used in assays to 
determine biological activity, including in a panel of multiple proteins for high-throughput 
screening; to raise antibodies or to elicit another immune response; as a reagent (including the 
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its 
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is 
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or 
development or in a disease state); and, of course, to isolate correlative receptors or ligands. 
Proteins involved in these binding interactions can also be used to screen for peptide or small 
molecule inhibitors or agonists of the binding interaction. 

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art. 
References disclosing such methods include without limitation "Molecular Cloning: A 
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch 
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning 
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987. 

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional 
sources or supplements. Such uses include without limitation use as a protein or amino acid 
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In 
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a 
particular organism or can be administered as a separate solid or liquid preparation, such as in the 
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the 
polypeptide or polynucleotide of the invention can be added to the medium in or on which the 
microorganism is cultured. 

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell 
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting) 
activity or may induce production of other cytokines in certain cell populations. A 
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many 
protein factors discovered to date, including all known cytokines, have exhibited activity in one 
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient 
confirmation of cytokine activity. The activity of therapeutic compositions of the present 
invention is evidenced by any one of a number of routine factor-dependent cell proliferation 
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, 
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK, 
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following: 

Assays for T-cell or thymocyte proliferation include without limitation those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol. 
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli 
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994. 

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse 
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells 
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in 
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991; 
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988; 
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse 
and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol 
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. 
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., Giannotti, J., 
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. 
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991. 

Assays for T-cell clone responses to antigens (which will identify, among others, proteins 
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and 
cytokine production) include, without limitation, those described in: Current Protocols in 
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, 
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse 
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7, 
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988. 

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and 
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem 
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or 
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or 
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential 
state which would be useful for re-engineering damaged or diseased tissues, transplantation, 
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce 
large quantities of human cells has important working applications for the production of human 
proteins which currently must be obtained from non-human sources or donors; implantation of 
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases; 
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including 
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs 
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung. 

It is contemplated that multiple different exogenous growth factors and/or cytokines may 
be administered in combination with the polypeptide of the invention to achieve the desired 
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and 
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand 
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage 
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet 
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast 
growth factor (bFGF). 

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques 
for culturing stem cells are known in the art and administration of polypeptides of the invention, 
optionally with other growth factors and/or cytokines, is expected to enhance the survival and 
proliferation of the stem cell populations. This can be accomplished by direct administration of 
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected 
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder 
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers 
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or 
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to induce 
autocrine expression of the polypeptide of the invention. This will allow for generation of 
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be 
differentiated into the desired mature cell types. These stable cell lines can also serve as a source 
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for 
polymerase chain reaction experiments. These studies would allow for the isolation and 
identification of differentially expressed genes in stem cell populations that regulate stem cell 
proliferation and/or maintenance. 

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be 

15 used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or 
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation 
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic 
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell 
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus 
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the 
effects of endogenous stem cell factor activity and allow differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al., Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g., as described by
Bernstein et al., Blood, 77: 2316-2321 (1991). 



4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.


4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is 
useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming cells, 
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking 
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes may also be possible using the composition of the 
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or 
other tissue formation in circumstances where such tissue is not normally formed has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include 
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art. 

The compositions of the present invention may also be useful for proliferation of neural 
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a 
composition may be used in the treatment of diseases of the peripheral nervous system, such as 
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous 
system diseases, such as Alzheimer's disease, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies 
resulting from chemotherapy or other medical therapies may also be treatable using a 
composition of the invention. 

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or 
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or 
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage. 

A composition of the present invention may also be useful for promoting or inhibiting 
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the 
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described in: 
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent 
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or immune 
suppressing activity, including without limitation the activities for which assays are described 
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A 
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as effecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune 
response. The functions of activated T cells may be inhibited by suppressing T cell responses or 
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is 
generally an active, non-antigen-specific, process which requires continuous exposure of the T 
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy 
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and 
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be 
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence 
of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without 
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high 
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell 
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue 
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells, 
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus act as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance 
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration 
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it 
may also be necessary to block the function of a combination of B lymphocyte antigens. 

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in 
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine 
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic 
compositions of the invention on the development of that disease. 

Blocking antigen function may also be therapeutically useful for treating autoimmune 
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are 
reactive against self tissue and which promote the production of cytokines and autoantibodies 
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may 
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T 
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T 
cell-derived cytokines which may be involved in the disease process. Additionally, blocking 
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to 
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating 
autoimmune disorders can be determined using a number of well-characterized animal models of 
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental 
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 
840-856). 

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means 
of up regulating immune responses, may also be useful in therapy. Upregulation of immune 
responses may be in the form of enhancing an existing immune response or eliciting an initial 
immune response. For example, enhancing an immune response may be useful in cases of viral 
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present 
invention as described herein such that the cells express all or a portion of the protein on their 
surface, and reintroduce the transfected cells into the patient. The infected cells would now be 
capable of delivering a costimulatory signal to, and thereby activating, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation, 
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. 
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that 
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994. 

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins 
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by 
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins 
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte 
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology 
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International 
Journal of Oncology 1:639-648, 1992. 

Assays for proteins that influence early steps of T-cell commitment and development 
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986. 

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of 
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved 
immune responses against the tumor or infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can 
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.
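
By way of illustration only, the readout of a membrane-migration assay of the kind referred to above might be scored as a chemotactic index, i.e., the ratio of cells migrating in the presence of the test protein to cells migrating toward medium alone. The following minimal Python sketch assumes replicate cell counts are already available; the function name, the example counts and the cutoff of 2.0 are hypothetical and are not taken from the cited references.

```python
from statistics import mean


def chemotactic_index(test_counts, control_counts):
    """Return mean migrated cells with test protein divided by mean with medium alone."""
    if not test_counts or not control_counts:
        raise ValueError("both conditions need at least one replicate")
    control_mean = mean(control_counts)
    if control_mean <= 0:
        raise ValueError("control migration must be positive")
    return mean(test_counts) / control_mean


# Hypothetical replicate counts of cells recovered on the far side of the membrane.
test_wells = [120, 135, 128]
control_wells = [40, 38, 42]

index = chemotactic_index(test_wells, control_wells)
# Illustrative cutoff only; a suitable threshold depends on the assay and cell type.
is_chemotactic = index >= 2.0
print(f"chemotactic index = {index:.2f}, chemotactic = {is_chemotactic}")
```

A ratio of this kind does not by itself separate directed movement (chemotaxis) from increased random movement (chemokinesis); that distinction would require additional controls.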


4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.


4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention 
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. 
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including 
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian 
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, 
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, 
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, 
hemangiopericytoma and Kaposi's sarcoma. 

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including 
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be 
administered to treat cancer. Therapeutic compositions can be administered in therapeutically 
effective dosages alone or in combination with adjuvant cancer therapy such as surgery, 
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial 
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise 
improving overall clinical condition, without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically 
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine. 
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination 
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide, 
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis- 
DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, 
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP16-213), 
Floxuridine, 5-Fluorouracil (5-FU), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, 
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), 
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, 
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, 
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate, 
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin, 
Semustine, Teniposide, and Vindesine sulfate. 

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically 
effective doses of the polypeptide of the invention to reduce the risk of developing cancers. 

In vitro models can be used to determine the effective doses of the polypeptide of the 
invention as a potential cancer treatment. These in vitro models include proliferation assays of 
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of 
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21), 
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52:921-30 
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in 
Pilkington et al., Anticancer Res., 17:4107-9 (1997), and angiogenesis assays such as induction 
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial 
cell migration as described in Ribatti et al., Intl. J. Dev. Biol., 40:1189-97 (1999) and Li et al., 
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available, 
e.g. from American Type Culture Collection catalogs. 

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor, 
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the 
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors 
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and 
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions 
and their ligands (including without limitation, cellular adhesion molecules (such as selectins, 
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen 
recognition and development of cellular and humoral immune responses). Receptors and ligands 
are also useful for screening of potential peptide or small molecule inhibitors of the relevant 
receptor/ligand interaction. A protein of the present invention (including, without limitation, 
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand 
interactions. 

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in: 
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. 
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28, 
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc. 
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988; 
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995. 

By way of example, the polypeptides of the invention may be used as a receptor for a 
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified 
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel 
overlay assays, or other methods known in the art. 

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or 
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the 
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes, 
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein 
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic 
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and 
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent 
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of 
toxins include, but are not limited to, ricin. 



4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques. 
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a 
solid support, borne on a cell surface or located intracellularly. One method of drug screening 
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant 
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such 
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can 
be used for standard binding assays. One may measure, for example, the formation of complexes 
between polypeptides of the invention or fragments and the agent being tested or examine the 
diminution in complex formation between the novel polypeptides and an appropriate cell line, 
which are well known in the art. 
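
As an illustration only, the diminution in complex formation caused by a test agent can be expressed as a percent inhibition relative to an untreated control. The following Python sketch is not part of the described method; the function names, the background-subtraction step, the 50% hit threshold and the example signal values are all hypothetical assumptions chosen for illustration.

```python
def percent_inhibition(signal_with_agent, signal_without_agent, background=0.0):
    """Percent decrease in complex formation caused by a test agent.

    Signals are arbitrary binding read-outs (e.g. counts); the background
    value is an assumed non-specific signal subtracted from both readings.
    """
    control = signal_without_agent - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    treated = signal_with_agent - background
    return 100.0 * (1.0 - treated / control)


def rank_hits(readings, control_signal, background=0.0, threshold=50.0):
    """Return (agent, percent inhibition) pairs at or above the threshold,
    strongest inhibitor first. The threshold is a hypothetical cut-off."""
    scored = [
        (agent, percent_inhibition(signal, control_signal, background))
        for agent, signal in readings.items()
    ]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical read-outs for three test agents against a control of 1000.
    example = {"agent_A": 250.0, "agent_B": 900.0, "agent_C": 100.0}
    for agent, inhibition in rank_hits(example, control_signal=1000.0):
        print(f"{agent}: {inhibition:.1f}% inhibition")
```

Ranking by percent inhibition is only one possible way to summarize a competitive binding screen; a dose-response titration would normally follow for any agent selected this way.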

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine 
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a 
review, see Science 282:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods, 
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and 
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, 
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. 
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin. 
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see 
Al-Obeidi et al., Mol. Biotechnol., 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol., 
1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem., 4(5):709-15 (1996) (alkylated dipeptides). 

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding 
molecule complex is then targeted to a tumor or other cell by the specificity of the binding 
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a 
ligand or a receptor. The art provides numerous assays particularly useful for identifying 
previously unknown binding partners for receptor polypeptides of the invention. For example, 
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used 
to identify polynucleotides encoding binding partners. As another example, affinity 
chromatography with the appropriate immobilized polypeptide of the invention can be used to 
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number 
of different libraries used for the identification of compounds, and in particular small molecules, 
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention. 
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous 
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for 
the expression of the receptor of the invention: one cell population expresses the receptor of the 
invention whereas the other does not. The responses of the two cell populations to the addition of 
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the 
polypeptide of the invention in cells and assayed for an autocrine response to identify potential 
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known 
in the art can be used to identify binding partner polypeptides, including (1) organic and 
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of random peptides, oligonucleotides or organic molecules. 

The role of downstream intracellular signaling molecules in the signaling cascade of the 
polypeptide of the invention can be determined. For example, a chimeric protein in which the 
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a 
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated 
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating 
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then 
be assayed for expected modifications, i.e., phosphorylation. Other methods known to those in the 
art can also be used to identify signaling molecules involved in receptor activity. 

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory activity. The 
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the 
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example, 
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory 
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production 
of other factors which more directly inhibit or promote an inflammatory response. Compositions 
with such activities can be used to treat inflammatory conditions (including chronic or acute 
conditions), including without limitation inflammation associated with infection (such as septic 
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, 
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or 
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from 
overproduction of cytokines such as TNF or IL-1. Compositions of the invention may also be 
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material. 
Compositions of this invention may be utilized to prevent or treat conditions such as, but not 
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid 
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, 
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary 
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such 
as for acute or chronic myelogenous leukemia, or in the prevention of premature labor secondary 
to intrauterine infections. 

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia, 
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic, 
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic 
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see 
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of 
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including 
human and non-human mammalian patients) according to the invention include but are not 
limited to the following lesions of either the central (including spinal cord, brain) or peripheral 
nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord 
infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured 
as a result of infection, for example, by an abscess or associated with infection by human 
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease, 
tuberculosis, or syphilis; 

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the 
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism 
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease, 
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus 
callosum), and alcoholic cerebellar degeneration; 

(vi) neurological lesions associated with systemic diseases including but not limited to 
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or 
sarcoidosis; 

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or 
injured by a demyelinating disease including but not limited to multiple sclerosis, human 
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies, 
progressive multifocal leukoencephalopathy, and central pontine myelinolysis. 

Therapeutics which are useful according to the invention for treatment of a nervous 
system disorder may be selected by testing for biological activity in promoting the survival or 
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit 
any of the following effects may be useful according to the invention: 

(i) increased survival time of neurons in culture; 

(ii) increased sprouting of neurons in culture or in vivo; 

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g., 
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or 

(iv) decreased symptoms of neuron dysfunction in vivo. 

Such effects may be measured by any method known in the art. In preferred, 
non-limiting embodiments, increased survival of neurons may be measured by the method set 
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may 
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al. 
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may 
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc., 
depending on the molecule to be measured; and motor neuron dysfunction may be measured by 
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron 
conduction velocity, or functional disability. 

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to toxin, 
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as 
well as other components of the nervous system, as well as disorders that selectively affect 
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal 
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile 
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), 
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy 
(Charcot-Marie-Tooth Disease). 



4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional 
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, 
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing 
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye 
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape 
(such as, for example, breast augmentation or diminution, change in bone form or shape); 
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female 
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or 
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other 
nutritional factors or component(s); affecting behavioral characteristics, including, without 
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression 
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain 
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other 
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting 
deficiencies of the enzyme and treating deficiency-related diseases; treatment of 
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such 
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen 
in a vaccine composition to raise an immune response against such protein or another material or 
entity which is cross-reactive with such protein. 



4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis 
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or 
susceptibility to various disease states (such as disorders involving inflammation or immune 
response) or a differential response to drug administration, and this genetic information can be 
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a 
polymorphism associated with a predisposition to inflammation or autoimmune disease makes 
possible the diagnosis of this condition in humans by identifying the presence of the 
polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all 
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally 
involving isolation or amplification of the DNA, and identifying the presence of the 
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment 
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to 
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are 
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a 
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately 
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides). 
In addition, traditional restriction fragment length polymorphism analysis (using restriction 
enzymes that provide differential digestion of the genomic DNA depending on the presence or 
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the 
present invention can be used to detect polymorphisms. The array can comprise modified 
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the 
present invention. In the alternative, any one of the nucleotide sequences of the present 
invention can be placed on the array to detect changes from those sequences. 
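
The principle behind restriction fragment length polymorphism analysis can be sketched in a few lines of Python. This is an illustrative toy, not a method of the invention: the two short allele sequences are invented for the example, and EcoRI (recognition site GAATTC, cutting after the first G) is used only because its site is widely known. A single base change that destroys the site changes the fragment pattern obtained on digestion.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear DNA sequence at every
    occurrence of a restriction site. cut_offset is the number of bases
    into the site at which the strand is cut."""
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)
    return fragments


def distinguishes_alleles(allele_a, allele_b, site, cut_offset):
    """True if digestion yields different fragment patterns for the two alleles."""
    return sorted(digest(allele_a, site, cut_offset)) != sorted(
        digest(allele_b, site, cut_offset)
    )


if __name__ == "__main__":
    ECORI_SITE, ECORI_OFFSET = "GAATTC", 1  # EcoRI cuts G^AATTC
    # Hypothetical alleles differing by one base inside the EcoRI site.
    reference = "ACGTACGTGAATTCTTGGCCAA"
    variant = "ACGTACGTGAGTTCTTGGCCAA"
    print(digest(reference, ECORI_SITE, ECORI_OFFSET))  # two fragments
    print(digest(variant, ECORI_SITE, ECORI_OFFSET))    # one uncut fragment
```

In practice the fragment lengths would be read from a gel rather than computed, but the comparison of patterns between alleles is the same.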

Alternatively, a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., 
by an antibody specific to the variant sequence. 



4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid 
arthritis are determined in an experimental animal model system. The experimental model system 
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983, 
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129. 
Induction of the disease can be caused by a single injection, generally intradermally, of a 
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The 
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant 
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about 
1-5 mg/kg. The control consists of administering PBS only. 

The procedure for testing the effects of the test compound would consist of intradermally 
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the 
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and 
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as 
described by J. Holoshitz above. An analysis of the data would reveal that the test compound 
would have a dramatic effect on the swelling of the joints as measured by a decrease of the 
arthritis score. 
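
The treatment and scoring schedule just described can be laid out explicitly. The Python sketch below is illustrative only and makes two assumptions not stated in the text: that the first treatment is given on day 0 (the day of adjuvant injection), and that "every other day" means an interval of two days. The helper for comparing group means is likewise hypothetical, and no score values are implied.

```python
# Scoring days after injection, as listed in the protocol description.
SCORING_DAYS = (14, 15, 18, 20, 22, 24)


def treatment_days(first_day=0, last_day=24, interval=2):
    """Days on which the test compound is given (assumed: day 0, then every 2 days)."""
    return list(range(first_day, last_day + 1, interval))


def mean_score_by_day(scores):
    """Average per-animal arthritis scores for each scoring day.

    scores maps a day to a list of individual scores for that day."""
    return {day: sum(values) / len(values) for day, values in sorted(scores.items())}


def score_reduction(control_scores, treated_scores):
    """Per-day difference between control and treated group means
    (positive values indicate a lower score in the treated group)."""
    control = mean_score_by_day(control_scores)
    treated = mean_score_by_day(treated_scores)
    return {day: control[day] - treated[day] for day in control if day in treated}


if __name__ == "__main__":
    print("treatment days:", treatment_days())
    print("scoring days:  ", list(SCORING_DAYS))
```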

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein. 



4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode of 
administration is not particularly important, parenteral administration is preferred. An 
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the 
polypeptides or other composition of the invention will normally be determined by the 
prescribing physician. It is to be expected that the dosage will vary according to the age, weight, 
condition and response of the individual patient. Typically, the amount of polypeptide 
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with 
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral 
administration, polypeptides of the invention will be formulated in an injectable form combined 
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art 
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting 
of small amounts of human serum albumin. The vehicle may contain minor amounts of 
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient. 
The preparation of such solutions is within the skill of the art. 


4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF 
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived, 
including without limitation from recombinant and non-recombinant sources and including 
antibodies and other binding partners of the polypeptides of the invention) may be administered 
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable 
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition 
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents, 
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term 
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the 
effectiveness of the biological activity of the active ingredient(s). The characteristics of the 
carrier will depend on the route of administration. The pharmaceutical composition of the 
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as 
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, 
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell 
factor, and erythropoietin. In further compositions, proteins of the invention may be combined 
with other agents beneficial to the treatment of the disease or disorder in question. These agents 
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth 
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor 
(IGF), as well as cytokines described herein. 
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The pharmaceutical composition may further contain other agents which either enhance 
the activity of the protein or other active ingredient or complement its activity or use in 
treatment. Such additional factors and/or agents may be included in the pharmaceutical 
composition to produce a synergistic effect with protein or other active ingredient of the 
5 invention, or to minimize side effects. Conversely, protein or other active ingredient of the 
present invention may be included in formulations of the particular clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti- 
inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other 
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as 

10 IL-IRa, IL-1 Hyl, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein 
of the present invention may be active in muJtimers (e.g., heterodimers or homodimers) or 
complexes with itself or other proteins. As a result, pharmaceutical compositions of the 
invention may comprise a protein of the invention in such multimeric or complexed form. 
As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the
relevant medical condition, or an increase in rate of treatment, healing, prevention or
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a therapeutically
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the 
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen.
When a therapeutically effective amount of protein or other active ingredient of the present
invention is administered orally, protein or other active ingredient of the present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form,
the pharmaceutical composition of the invention may additionally contain a solid carrier such as
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, 
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or
other vehicle as known in the art. The pharmaceutical composition of the present invention may
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in the 
art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained by combining the compounds
with a solid excipient, optionally grinding a resulting mixture, and processing the mixture of
granules, after adding
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in 
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents 
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, 
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be 
added to the tablets or dragee coatings for identification or to characterize different combinations 
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as 
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or 
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in
an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other
biocompatible polymers may replace polyethylene glycol, e.g., polyvinyl pyrrolidone; and other
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
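
The VPD proportions stated above can be turned into weighed amounts with simple arithmetic. The following is a minimal sketch only, assuming that "% w/v" means grams per 100 mL of final solution and that the 1:1 dilution with 5% dextrose in water is by volume with additive volumes; the function and variable names are hypothetical and are not part of the disclosure.

```python
# Illustrative arithmetic for the VPD co-solvent system described above.
# Assumption: "% w/v" is grams of component per 100 mL of final solution.
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}


def grams_for_volume(percent_wv, volume_ml):
    """Grams of a component needed for volume_ml of solution at percent_wv."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return percent_wv * volume_ml / 100.0


def vpd_recipe(volume_ml):
    """Grams of each VPD component; the remainder is made up to volume in absolute ethanol."""
    return {name: grams_for_volume(pct, volume_ml)
            for name, pct in VPD_PERCENT_WV.items()}


def vpd_5w_percent_wv():
    """Final % w/v after mixing equal volumes of VPD and 5% dextrose in water.

    Assumption: volumes are additive, so every concentration is halved.
    """
    final = {name: pct / 2.0 for name, pct in VPD_PERCENT_WV.items()}
    final["dextrose"] = 5.0 / 2.0
    return final


if __name__ == "__main__":
    print(vpd_recipe(250.0))
    print(vpd_5w_percent_wv())
```

Under these assumptions, the diluted VPD:5W system carries half the stated concentration of each VPD component plus 2.5% w/v dextrose.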

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient 
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should 
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 
described above, may alternatively or additionally, be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further
matrices are comprised of pure proteins or extracellular matrix components. Other potential
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the
above-mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as in
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on
total formulation weight, which represents the amount necessary to prevent desorption of the
protein from the polymer matrix and to provide appropriate handling of the composition, yet not
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and 
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of 
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition, 
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an amount effective to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount 
effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a 
circulating concentration range that can be used to more accurately determine useful doses in 
humans. For example, a dose can be formulated in animal models to achieve a circulating
concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of
the test compound which achieves a half-maximal inhibition of the protein's biological activity). 
Such information can be used to more accurately determine useful doses in humans. 

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration
utilized. The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1 p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the 
desired effects, or minimal effective concentration (MEC). The MEC will vary for each 
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or 
bioassays can be used to determine plasma concentrations. 
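
The therapeutic index defined above is a plain ratio, which can be sketched as follows. This is an illustration only; the function name and the numeric values in the usage lines are hypothetical and do not come from any study described herein.

```python
def therapeutic_index(ld50, ed50):
    """Return LD50 / ED50, the dose ratio between toxic and therapeutic effects.

    Both values must be expressed in the same units (for example mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("ld50 and ed50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Hypothetical values chosen only to show the calculation.
    compound_a = therapeutic_index(ld50=200.0, ed50=10.0)
    compound_b = therapeutic_index(ld50=200.0, ed50=50.0)
    # A higher index means a wider margin between effective and lethal doses.
    print(compound_a, compound_b)
```

With these made-up numbers, the first compound has the higher index and would be the preferred one under the criterion stated above.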

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration. 
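
One way to check a regimen against the MEC criterion above is to estimate, from sampled plasma concentrations, the fraction of time spent at or above the MEC. The sketch below assumes linear interpolation between sampling points, which is a simplification and not a method stated in this disclosure; the function name and the sample data are hypothetical.

```python
def fraction_time_above_mec(times, concentrations, mec):
    """Fraction of the sampled interval during which concentration >= mec.

    times must be strictly increasing; concentrations are in the same units
    as mec. Concentration is assumed to vary linearly between samples.
    """
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    above = 0.0
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        c0, c1 = concentrations[i], concentrations[i + 1]
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        if c0 >= mec and c1 >= mec:
            above += dt                      # whole segment above the MEC
        elif c0 < mec and c1 < mec:
            continue                         # whole segment below the MEC
        else:
            # The segment crosses the MEC once; locate the crossing point.
            crossing = (mec - c0) / (c1 - c0)
            above += dt * (1.0 - crossing) if c1 >= mec else dt * crossing
    return above / (times[-1] - times[0])


if __name__ == "__main__":
    hours = [0.0, 2.0, 4.0, 6.0, 8.0]
    levels = [0.0, 4.0, 4.0, 0.0, 0.0]       # hypothetical plasma levels
    print(fraction_time_above_mec(hours, levels, mec=2.0))
```

The returned fraction can then be compared with the 10-90%, 30-90% or 50-90% windows given above.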

An exemplary dosage regimen for polypeptides or other compositions of the invention 
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter 
intervals. 
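
Because the exemplary and preferred ranges mix microgram and milligram units, a small worked conversion may help. The sketch below converts the per-kilogram bounds stated above into total daily amounts in milligrams for a given body weight; the 70 kg weight and the function name are hypothetical, and the result is an arithmetic illustration, not a dosing recommendation.

```python
UG_PER_MG = 1000.0

# Bounds from the text, expressed in micrograms per kg of body weight per day.
EXEMPLARY_UG_PER_KG = (0.01, 100 * UG_PER_MG)   # 0.01 ug/kg to 100 mg/kg
PREFERRED_UG_PER_KG = (0.1, 25 * UG_PER_MG)     # 0.1 ug/kg to 25 mg/kg


def daily_dose_range_mg(weight_kg: float, bounds_ug_per_kg: tuple) -> tuple:
    """Return (low, high) total daily dose in mg for the given body weight."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    low, high = bounds_ug_per_kg
    return (low * weight_kg / UG_PER_MG, high * weight_kg / UG_PER_MG)


# Hypothetical 70 kg adult.
exemplary_mg = daily_dose_range_mg(70.0, EXEMPLARY_UG_PER_KG)
preferred_mg = daily_dose_range_mg(70.0, PREFERRED_UG_PER_KG)
```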

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition. 



4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and
F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated related protein of the invention may be intended to serve as an antigen, or a
portion or fragment thereof, and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1787, and
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific
immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its
surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of the related protein that is located on the surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the Kyte
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
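
The hydropathy analysis mentioned above can be sketched as a sliding-window average over the Kyte and Doolittle residue scale. The window length of 7, the threshold of -1.5, the function names and the toy sequence are assumptions chosen for illustration; no Fourier transformation is applied, and the example is not meant to represent any sequence of the invention.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list:
    """Mean hydropathy of every window of `window` consecutive residues."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[res] for res in seq]  # KeyError on unknown residue
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def hydrophilic_windows(sequence: str, window: int = 7,
                        threshold: float = -1.5) -> list:
    """Return (start_index, peptide) for windows at or below the threshold."""
    seq = sequence.upper()
    profile = hydropathy_profile(seq, window)
    return [(i, seq[i:i + window])
            for i, score in enumerate(profile) if score <= threshold]


# Toy sequence: a lysine-rich stretch followed by an alanine-rich stretch.
hits = hydrophilic_windows("KKKKKKKAAAAAAA")
```

Windows with strongly negative mean values are candidate surface-exposed regions under this scale; in practice the window length and cutoff would be tuned to the protein being analyzed.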

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below. 

5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such 
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, 
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an 
adjuvant. Various adjuvants used to increase the immunological response include, but are not 
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface 
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, 
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of 
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A, 
synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG 
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of 
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain 
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal 
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by 
a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro. 
The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion 
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin 
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are 
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing 
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: 
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin. 
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in 
a suitable culture medium that preferably contains one or more substances that inhibit the growth 
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for 
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT 
medium"), which substances prevent the growth of HGPRT-deficient cells. 

Preferred immortalized cell lines are those that fuse efficiently, support stable high level 
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which 
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding
specificity of monoclonal antibodies produced by the hybridoma cells is determined by
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunoabsorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
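
To illustrate how a binding affinity can be read from a Scatchard plot, the sketch below fits the classical linear transformation Bound/Free = (Bmax - Bound)/Kd by ordinary least squares. This is a simplified textbook form assuming a single class of non-interacting binding sites, not the Munson and Pollard procedure itself; the function name and the synthetic data are hypothetical.

```python
def scatchard_fit(bound: list, free: list) -> tuple:
    """Estimate (Kd, Bmax) from paired bound/free ligand concentrations.

    Fits y = slope * x + intercept with x = bound and y = bound / free,
    where slope = -1 / Kd and intercept = Bmax / Kd.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    if sxx == 0:
        raise ValueError("bound values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope          # dissociation constant
    bmax = intercept * kd      # total binding-site concentration
    return kd, bmax


# Synthetic, noise-free data generated with Kd = 2.0 and Bmax = 10.0 (arbitrary units).
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [10.0 * f / (2.0 + f) for f in free_conc]
kd_est, bmax_est = scatchard_fit(bound_conc, free_conc)
affinity_ka = 1.0 / kd_est  # association constant; higher means tighter binding
```

A smaller Kd (larger Ka) corresponds to the high binding affinity that the text describes as preferred. With real, noisy data a nonlinear fit of the untransformed binding curve is generally more reliable than the linearized form.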

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.
The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to 
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, 
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the 
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some 
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the 
humanized antibody will comprise substantially all of at least one, and typically two, variable 
domains, in which all or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at 
least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).



5.13.3 Human Antibodies

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. 
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell 
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's endogenous 
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The 
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins 
are inserted into the host's genome. The human genes are incorporated, for example, using yeast 
artificial chromosomes containing the requisite human DNA segments. An animal which 
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate 
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells 
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the 
animal after immunization with an immunogen of interest, as, for example, a preparation of a 
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.

5.13.4 Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991, EMBO J., 10:3655-3659.
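
The figure of ten possible molecules follows from simple counting, which the sketch below reproduces. It assumes that either heavy chain can pair with either light chain to form a half-molecule (arm), and that an antibody is an unordered pair of two such arms; the chain labels H1, H2, L1 and L2 are hypothetical names used only for this illustration.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ("H1", "H2")   # heavy chains of antibody 1 and antibody 2
light_chains = ("L1", "L2")   # light chains of antibody 1 and antibody 2

# Each arm is one heavy chain paired with one light chain: 2 x 2 = 4 arms.
arms = list(product(heavy_chains, light_chains))

# An antibody is an unordered pair of arms, repeats allowed: 10 molecules.
molecules = list(combinations_with_replacement(arms, 2))

# Only the molecule with both cognate arms has the desired bispecific structure.
correct = [m for m in molecules if set(m) == {("H1", "L1"), ("H2", "L2")}]

print(len(arms), len(molecules), len(correct))
```

Under these assumptions the desired heterodimer is one species among ten, which is why the affinity chromatography steps described above are needed to isolate it.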

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side
chain(s) are created on the interface of the second antibody molecule by replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).



5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.



5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176:1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).


5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable media can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon a nucleotide sequence of the present 
invention. The choice of the data storage structure will generally be based on the means chosen 
to access the stored information. In addition, a variety of data processor programs and formats 
can be used to store the nucleotide sequence information of the present invention on computer 
readable medium. The sequence information can be represented in a word processing text file, 
formatted in commercially-available software such as WordPerfect and Microsoft Word, or 
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, 
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring 
formats (e.g. text file or database) in order to obtain computer readable medium having recorded 
thereon the nucleotide sequence information of the present invention. 
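
By way of illustration only, the two storage options named above (a plain text file and a database) can be sketched in Python. The sketch assumes FASTA-style text and an SQLite table; the table name, column names, function names and the placeholder records are hypothetical and are not taken from the sequence listing.

```python
import sqlite3


def to_fasta(records, width=60):
    """Format (identifier, sequence) pairs as FASTA text, wrapping at `width`."""
    lines = []
    for seq_id, seq in records:
        lines.append(f">{seq_id}")
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


def store_in_database(conn, records):
    """Record (identifier, sequence) pairs in a simple one-table database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences ("
        "seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany("INSERT OR REPLACE INTO sequences VALUES (?, ?)", records)
    conn.commit()


def fetch_sequence(conn, seq_id):
    """Return the stored sequence for `seq_id`, or None if it is absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None


# Placeholder records for illustration; not sequences of the invention.
EXAMPLE_RECORDS = [
    ("example_1", "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"),
    ("example_2", "ATGAAATTTGGGCCCTAA"),
]
```

A text file is convenient for exchange between programs, whereas a database supports keyed retrieval; which is preferable depends, as stated above, on the means chosen to access the stored information.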

By providing any of the nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 or a
representative fragment thereof, or a nucleotide sequence at least 95% identical to any of the
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358, in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes. Computer
software is publicly available which allows a skilled artisan to access sequence information
provided in a computer readable medium. The examples which follow demonstrate how
software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system
is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may
be protein encoding fragments and may be useful in producing commercially important proteins
such as enzymes used in fermentation reactions and in the production of commercially useful
metabolites.
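
As a minimal sketch of what identifying an ORF involves, the following Python fragment scans all six reading frames for an ATG codon followed in frame by a stop codon. It is a plain codon scan, not an implementation of BLAST or BLAZE, and it assumes the standard stop codons; the function names and the `min_codons` threshold are hypothetical.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, frame, start, end, orf) for each ATG...stop ORF.

    Coordinates are 0-based and refer to the strand being scanned; for the
    "-" strand they index into the reverse complement. `min_codons` counts
    the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START:
                    start = i  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs
```

In practice, candidate ORFs found this way would still be compared against known sequences, for example with the search software referred to above, before being treated as protein encoding.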

As used herein, "a computer-based system" refers to the hardware means, software 
means, and data storage means used to analyze the nucleotide sequence information of the 
present invention. The minimum hardware means of the computer-based systems of the present 
invention comprises a central processing unit (CPU), input means, output means, and data 
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly, and a variety of commercially
available software for conducting search means is available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
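
The relationship between target length and chance matches can be made concrete with a short calculation. The sketch below assumes a simplified model in which every database position is independent and all four nucleotides are equally likely, so that a target of length n is expected to occur by chance about (N - n + 1) / 4^n times in a database of N nucleotides. The function names are hypothetical, and the exact-match search shown is far simpler than the homology search programs named above.

```python
def expected_random_matches(target_len, database_len, alphabet_size=4):
    """Expected chance occurrences of one specific target in random sequence."""
    positions = max(database_len - target_len + 1, 0)
    return positions * (1.0 / alphabet_size) ** target_len


def find_target(target, database):
    """Return the 0-based start of every exact occurrence (overlaps allowed)."""
    hits = []
    pos = database.find(target)
    while pos != -1:
        hits.append(pos)
        pos = database.find(target, pos + 1)
    return hits
```

Under this model a six-nucleotide target is expected about once in every 4096 positions, whereas a 30-nucleotide target is expected fewer than once in 10^17 positions, which is why longer targets are less likely to be found as random occurrences.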

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on a 
25 three-dimensional configuration which is formed upon the folding of the target motif. There are 
a variety of target motifs known in the art. Protein target motifs include, but are not limited to, 
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences). 

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to 
control gene expression through triple helix formation or antisense DNA or RNA, both of which 
methods are based on the binding of a polynucleotide sequence to DNA or RNA. 
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
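
For the antisense case, the sequence of a candidate oligonucleotide follows directly from the mRNA sequence: it is the reverse complement of the chosen target region. The sketch below is illustrative only; it covers Watson-Crick antisense design (not triple helix design, which follows different pairing rules), enforces the 20 to 40 base length preference stated above, and uses a hypothetical function name and argument order.

```python
# Map each mRNA base to the DNA base that pairs with it.
COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_dna_oligo(mrna, start, length=20):
    """Return a DNA oligo (5'->3') complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    # Accept cDNA-style input by converting T to U first.
    region = mrna.upper().replace("T", "U")[start:start + length]
    if len(region) < length:
        raise ValueError("target region runs past the end of the mRNA")
    # Complement each base, then reverse so the result reads 5'->3'.
    return region.translate(COMPLEMENT)[::-1]
```

Selecting which region of the mRNA to target (for example, for accessibility or specificity) is a separate question that this sketch does not address.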

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of 
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic 
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated 
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the polynucleotide 
for a period sufficient to form the complex, and detecting the complex, so that if a complex is 
detected, a polynucleotide of the invention is detected in the sample. Such methods can also 
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed 
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is 
detected in the sample. 
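
The amplification-based method can be modeled in a few lines. The sketch below is an idealization that treats annealing under stringent conditions as an exact sequence match: the forward primer must occur in the template strand and the reverse primer must be complementary to a site downstream of it. The function names are hypothetical, and real primer annealing tolerates mismatches that this model ignores.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def predict_amplicon(template, forward_primer, reverse_primer):
    """Return the predicted product, or None if either primer fails to anneal."""
    template = template.upper()
    forward_primer = forward_primer.upper()
    fwd_start = template.find(forward_primer)
    if fwd_start == -1:
        return None
    # The reverse primer anneals to the template strand, so its binding site
    # reads as the primer's reverse complement on that strand.
    rev_site = reverse_complement(reverse_primer.upper())
    rev_start = template.find(rev_site, fwd_start + len(forward_primer))
    if rev_start == -1:
        return None
    return template[fwd_start:rev_start + len(rev_site)]
```

A non-None result corresponds to the case described above in which a polynucleotide is amplified and is therefore detected in the sample.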

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying for 
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One 
skilled in the art will recognize that any one of the commonly available hybridization, 
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the assays of the present invention. Specifically, the invention 
provides a compartment kit to receive, in close confinement, one or more containers which
comprise: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, and reagents capable of detecting the presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate 
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the 
established kit formats which are well known in the art. 

4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID
NO:1-1786 and 3573-5358, or bind to a specific domain of the polypeptide encoded by the nucleic
acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present 
invention, or nucleic acid of the invention; and 

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also 
comprise contacting a compound with a polypeptide of the invention in a cell for a time 
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that 
binds a polypeptide of the invention is identified. 

Compounds identified via such methods can include compounds which modulate the 
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the 
invention (that is, increase or decrease expression relative to expression levels observed in the 
absence of the compound). Compounds, such as compounds identified via the methods of the 
invention, can be tested using standard assays well known to those of skill in the art for their 
ability to modulate activity/expression. 

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and 
the like are selected at random and are assayed for their ability to bind to the protein encoded by 
the ORF of the present invention. Alternatively, agents may be rationally selected or designed. 
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen 
based on the configuration of the particular protein. For example, one skilled in the art can 
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the 
like, capable of binding to a specific peptide sequence, in order to generate rationally designed 
antipeptide peptides, for example see Hruby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kasprzak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly 
described, can be used to control gene expression through binding to one of the ORFs or EMFs 
of the present invention. As described above, such agents can be randomly screened or 
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or 
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding 
agents are agents which contain base residues which hybridize or form a triple helix formation 
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, 
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have 
base attachment capacity. 

Agents suitable for use in these methods preferably contain 20 to 40 bases and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide and other DNA binding agents. 

Agents which bind to a protein encoded by one of the ORFs of the present invention can 
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the 
present invention can be formulated using known techniques to generate a pharmaceutical 
composition. 



4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid 
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The 
hybridization probes of the subject invention may be derived from any of the nucleotide 
sequences SEQ ID NO:1-1786 and 3573-5358. Because the corresponding gene is only
expressed in a limited number of tissues, a hybridization probe derived from any of the
nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 can be used as an indicator of the
presence of RNA of a cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in situ 
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides 
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in 
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a 
degenerate pool of possible sequences for identification of closely related genomic sequences. 
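
To illustrate what a degenerate pool contains, the sketch below expands a probe written with the standard IUPAC nucleotide ambiguity codes into every discrete sequence it represents. The function name is hypothetical, and the example probes in the usage note are arbitrary strings rather than sequences of the invention.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """Return every discrete sequence encoded by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

For instance, `expand_degenerate("ATGR")` yields the two sequences ATGA and ATGG, and each additional N position multiplies the pool size by four, so the degeneracy of a probe grows quickly with the number of ambiguous positions.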

Other means for producing specific hybridization probes for nucleic acids include the 
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors 
are known in the art and are commercially available and may be used to synthesize RNA probes 
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may 
be used to construct hybridization probes for mapping their respective genomic sequences. The 
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a 
chromosome using well known genetic and/or chromosomal mapping techniques. These 
techniques include in situ hybridization, linkage analysis against known chromosomal markers, 
hybridization screening with libraries or flow-sorted chromosomal preparations specific to 
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. Examples 
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or 
predisposition to a specific disease) may help delimit the region of DNA associated with that 
genetic disease. The nucleotide sequences of the subject invention may be used to detect 
differences in gene sequences between normal, carrier or affected individuals. 

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin 
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies 
(Alameda, CA). 

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
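
The concentrations in the linkage method follow from the usual dilution relation C1 x V1 = C2 x V2. The sketch below is a worked check of that arithmetic only; the helper names are hypothetical, and the 100 ul batch volume used in the usage note is an arbitrary example rather than a value from the method.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Concentration after diluting `stock_volume` of stock to `total_volume`."""
    return stock_conc * stock_volume / total_volume


def stock_volume_needed(stock_conc, final_conc, final_volume):
    """Volume of stock required to reach `final_conc` in `final_volume`."""
    return final_conc * final_volume / stock_conc


# 25 ul of 0.2 M carbodiimide added to 75 ul of DNA solution in the well:
carbodiimide_in_well = final_concentration(0.2, 25, 75 + 25)  # in M

# 0.1 M 1-methylimidazole stock brought to 10 mM in an example 100 ul batch:
meim_stock_ul = stock_volume_needed(0.1, 0.010, 100)  # in ul
```

On these figures the carbodiimide is at about 0.05 M in the well during the incubation, and a tenfold dilution of the 1-methylimidazole stock gives the stated 10 mM.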

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For example,
addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6 (incorporated
herein by reference). These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.
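
As an illustration of how a combinatorial strategy produces many probes from few building blocks, the sketch below enumerates every sequence of length k over the four nucleotides and lays the result out on a square grid. For k = 4 this gives 4^4 = 256 sequences, which matches the matrix size mentioned above; the text does not state that the cited array was composed in this way, so the correspondence is offered as an assumption. The function names and the 16-column layout are hypothetical.

```python
from itertools import product


def all_kmers(k, alphabet="ACGT"):
    """Return every sequence of length k over the alphabet, in sorted order."""
    return ["".join(p) for p in product(alphabet, repeat=k)]


def layout_grid(probes, columns):
    """Assign probes to rows of `columns` spatially defined positions."""
    return [probes[i:i + columns] for i in range(0, len(probes), columns)]


# All tetranucleotides arranged as a 16 x 16 matrix of addressable positions.
TETRAMER_GRID = layout_grid(all_kmers(4), 16)
```

Because each position in the grid has a known sequence, a hybridization signal at a given row and column identifies the complementary sequence present in the sample.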

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic 
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA, 
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p. 
9.14-9.23). 

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et
al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6 (incorporated herein by reference). In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
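
The site specificities described above can be illustrated with a minimal sketch. It is not part of the cited work: the function names are hypothetical, the sequence is treated as linear (a circular plasmid such as pUC19 would need wrap-around handling), and only the recognition rules stated in the text (PuGCPy under normal conditions; PuGCPy, PyGCPy and PuGCPu under relaxed conditions; blunt cut between G and C) are assumed.

```python
import re

# Shorthand from the text: Pu = purine (A or G), Py = pyrimidine (C or T).
# Lookahead groups are used so that overlapping sites are all reported.
NORMAL_SITE = re.compile(r"(?=([AG]GC[CT]))")  # PuGCPy
RELAXED_SITES = re.compile(
    r"(?=([AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG]))"  # PuGCPy, PyGCPy, PuGCPu
)


def cut_positions(seq, pattern):
    """Return 0-based indices where the blunt cut falls (between G and C)."""
    seq = seq.upper()
    # A site starts at m.start(); G is at +1 and C at +2, so the cut is at +2.
    return [m.start() + 2 for m in pattern.finditer(seq)]


def fragments(seq, pattern):
    """Split a linear sequence at every cut position (hypothetical helper)."""
    seq = seq.upper()
    bounds = [0] + cut_positions(seq, pattern) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    example = "TTCGCTAA"  # contains one PyGCPy site (CGCT)
    print(fragments(example, NORMAL_SITE))    # uncut under normal specificity
    print(fragments(example, RELAXED_SITES))  # cut under relaxed specificity
```

In this sketch a PyGCPu site is never cut, since the text reports cleavage only at the three site classes listed.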

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5
µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
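
The 64-patient example can be sketched as a coordinate calculation. This is an illustration only, under assumptions not stated in the text: the pins are taken to sit on a 9 mm grid (the usual 96-well spacing), each subarray is taken to be an 8 x 8 block of dots on a 1 mm grid, and the names `dot_position` and `layout` are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12  # 96-pin device, one pin per well position
PIN_PITCH_MM = 9.0          # assumption: standard 96-well spacing
DOT_PITCH_MM = 1.0          # assumption: one dot per 1 mm step
SUB_SIDE = 8                # assumption: 64 dots arranged as 8 x 8


def dot_position(patient, pin_row, pin_col):
    """Return (x_mm, y_mm) of the dot printed by one pin for one patient plate.

    Offset printing: every plate is printed with the whole pin array shifted
    by a plate-specific offset, so each pin builds its own 64-dot subarray.
    """
    if not 0 <= patient < SUB_SIDE * SUB_SIDE:
        raise ValueError("patient index must be in 0..63")
    off_row, off_col = divmod(patient, SUB_SIDE)
    x = pin_col * PIN_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_PITCH_MM
    return (x, y)


def layout(n_patients=64):
    """Map (patient, pin_row, pin_col) to membrane coordinates in mm."""
    return {
        (p, r, c): dot_position(p, r, c)
        for p in range(n_patients)
        for r in range(PIN_ROWS)
        for c in range(PIN_COLS)
    }


if __name__ == "__main__":
    dots = layout()
    print(len(dots))                 # 64 plates x 96 pins
    print(dot_position(63, 7, 11))   # last patient, last pin
```

Under these assumed pitches the 96 subarrays are identical replicas, each 8 mm block is followed by a 1 mm gap before the next subarray, and the outermost dot lies within an 8 x 12 cm membrane.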

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers, e.g. a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated by
reference in their entirety.

5.0 EXAMPLES

5.1.1 EXAMPLE 1

Novel Nucleic Acid Sequences Obtained From Various Libraries 
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.1.2 EXAMPLE 2
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 3573-5358,
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend
the seed EST into an extended assemblage, by pulling additional sequences from different databases
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gbpri 114, and UniGene
version 101) that belong to this assemblage. The algorithm terminated when there were no
additional sequences from the above databases that would extend the assemblage. Inclusion of
component sequences into the assemblage was based on a BLASTN hit to the extending assemblage
with BLAST score greater than 300 and percent identity greater than 95%.
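
The extension procedure can be sketched as follows. This is an illustrative outline rather than the software actually used: the `search` callable stands in for a BLASTN query against the listed databases and is hypothetical, the loop is written iteratively with a worklist instead of recursively, and hits are looked up per member sequence rather than against a merged assemblage. Only the two inclusion thresholds are taken from the text.

```python
from typing import Callable, Iterable, NamedTuple, Set


class Hit(NamedTuple):
    seq_id: str              # identifier of the database sequence that was hit
    score: float             # BLAST score of the hit
    percent_identity: float  # percent identity of the hit


MIN_SCORE = 300.0     # inclusion requires a score greater than 300
MIN_IDENTITY = 95.0   # inclusion requires identity greater than 95%


def passes(hit: Hit) -> bool:
    """Apply the two inclusion thresholds stated in the text."""
    return hit.score > MIN_SCORE and hit.percent_identity > MIN_IDENTITY


def extend_assemblage(seed: str,
                      search: Callable[[str], Iterable[Hit]]) -> Set[str]:
    """Grow an assemblage from a seed until no new sequence qualifies."""
    members = {seed}
    frontier = [seed]
    while frontier:                      # stops when nothing extends the set
        current = frontier.pop()
        for hit in search(current):
            if passes(hit) and hit.seq_id not in members:
                members.add(hit.seq_id)
                frontier.append(hit.seq_id)
    return members


if __name__ == "__main__":
    # Toy hit table standing in for real database searches.
    toy_hits = {
        "seed": [Hit("a", 350, 97.0), Hit("b", 290, 99.0), Hit("c", 400, 95.0)],
        "a": [Hit("d", 301, 95.1), Hit("seed", 500, 100.0)],
    }
    print(sorted(extend_assemblage("seed", lambda s: toy_hits.get(s, []))))
```

In the toy run, sequence "b" is rejected on score and "c" on identity, since both thresholds are strict.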

A polypeptide was predicted to be encoded by each of SEQ ID NO: 3573-5358 as set forth
below. The polypeptides were predicted using a software program called FASTY (available from
http://fasta.bioch.virginia.edu), which selects a polypeptide based on a comparison of translated
novel polynucleotide to known polypeptides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). The predicted polypeptides are shown in Table 7.

5.2.2 EXAMPLE 3
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1-327.

Table 1 shows the various tissue sources of SEQ ID NO: 1-327.

The nearest neighbor results for SEQ ID NO: 1-327 were obtained by a FASTA version 3
search against Genpept release 117, using FASTXY algorithm. FASTXY is an improved
version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor result
showed the closest homologue for SEQ ID NO: 1-327 from Genpept. The translated amino acid
sequences for which the nucleic acid sequence encodes are shown in the Sequence Listing. The
nearest neighbor results for SEQ ID NO: 1-327 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.3.2 EXAMPLE 4 

Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gbpri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants resulting from
these procedures, are shown in the Sequence Listing as SEQ ID NOS: 328-1413.
Table 1 shows the various tissue sources of SEQ ID NO: 328-1413.

The nearest neighbor results for SEQ ID NO: 328-1413 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 118, using BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 328-1413 from Genpept.
The translated amino acid sequences for which the nucleic acid sequence encodes are shown in
the Sequence Listing. The nearest neighbor results for SEQ ID NO: 328-1413 are shown in
Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.3.2 EXAMPLE 5

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gbpri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1414-1652.
Table 1 shows the various tissue sources of SEQ ID NO: 1414-1652.
The nearest neighbor results for SEQ ID NO: 1414-1652 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 118, using BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 1414-1652 from
Genpept. The translated amino acid sequences for which the nucleic acid sequence encodes are
shown in the Sequence Listing. The nearest neighbor results for SEQ ID NO: 1414-1652 are
shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.4.2 EXAMPLE 6 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gbpri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1653-1745.
Table 1 shows the various tissue sources of SEQ ID NO: 1653-1745.
The homology for SEQ ID NO: 1653-1745 was obtained by a BLASTP version 2.0al
19MP-WashU search against Genpept release 118, using BLAST algorithm. The results showed
homologues for SEQ ID NO: 1653-1745 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1653-1745 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5.2 EXAMPLE 7 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gbpri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants resulting from
these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1746-1768.
Table 1 shows the various tissue sources of SEQ ID NO: 1746-1768.
The homology for SEQ ID NO: 1746-1768 was obtained by a BLASTP version 2.0al
19MP-WashU search against Genpept release 119, using BLAST algorithm. The results showed
homologues for SEQ ID NO: 1746-1768 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1746-1768 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were examined
to determine whether they had identifiable signature regions. Table 3 shows the signature region
found in the indicated polypeptide sequences, the description of the signature, the eMatrix
p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites,"
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 5 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.

5.6.2 EXAMPLE 8

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gbpri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The translated amino acid sequences for which the nucleic acid
sequence encodes are shown in the Sequence Listing. The full-length nucleotide sequences, including splice
variants resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1769-1786.

Table 1 shows the various tissue sources of SEQ ID NO: 1769-1786. 

The homology for SEQ ID NO: 1769-1786 was obtained by a BLASTP version 2.0al
19MP-WashU search against Genpept release 120 and the amino acid version of Geneseq
released on October 26, 2000, using BLAST algorithm. The results showed homologues for
SEQ ID NO: 1769-1786 from Genpept. The homologues with identifiable functions for SEQ ID
NO: 1769-1786 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

Table 6 is a correlation table of all of the sequences and the SEQ ID NOS.

TABLE 1

Tissue Origin | RNA Source | Hyseq Library Name | SEQ ID NOS:

adult brain | GIBCO | AB5001 | 9 19-21 50-51 65-66 72 78 80 82
85 87 107-108 113 116 123 138
140 150-152 159 169 177 192-193
202-203 212-214 225-226 235-236
251 258 268-269 272 280-281 295
298 301 321 326 331-332 334 356-
357 362 369 379 382-363 416 423
443 459-460 473 475 477 488 496
500 503 519 526 547 574 582 587
608-609 613 618 633-634 645-646
652 657-658 560 669-671 678 687
695 697 710 715 724 731 775-777
796 804 811 857-859 862 869 899-
900 912 919 922 924-929 933 936
962 979 988-989 996 1001 1004-
1008 1018 1039 1047 1059 1064
1067 1070 1078 1082 1107 1113
1116-1117 1131 1134-1137 1140
1149 1151 1157 1180 1206 1229
1234 1241 1243 1258 1272-1273
1279 1288-1290 1294 1307-1308
1312 1320 1323 1330 1356 1360-
1361 1368 1373-1375 1379 1391
1400 1417 1446 1468 1482 1493-
1494 1501-1503 1506-1507 1512
1517 1522-1524 1530-1533 1537
1549 1565 1578 1598 1606 1608
1623 1625 1627 1639 1643 1648-
1649 1653 1664 1667 1671 1696
1734 1741 1743-1744 1760-1761
1771



adult brain | GIBCO | ABD003 | 3 12-14 18-19 25 30-31 34-36 43-
45 50-51 56 58 60 65-66 68-69 80
82 85 87 92 104 107-108 112-113
115-116 123-124 131-132 135-137
139 142 146 148-149 152 154 157
159 163 165 167 169 172 180 192-
193 196-197 199 203 208 210 212-
214 223 233 235-237 247 257 259
261 268-269 272 276 280-281 284-
288 291-292 295 297 300-301 304
307 317 320-321 323 327 329-331
333-334 345-349 356-357 379-381
393 401 408 414 419 424 426-428
430 433-436 438-439 443 445 449
453-454 459-461 468 471-473 476-
478 483 491 494 496 500 503 507-
508 516 519-520 525-527 534 536-
540 542-543 545 553 555 560 569-
570 574-576 586-588 593 595 597
601 606-609 616-620 622-623 625
628-633 635-636 643 645-649 653
655-656 660-665 668-670 676 681
687 701 710 715 717 724-728 735
743 745-746 750 753 759 765-766
773 775-778 786 789 796 799-800
802-803 810-811 815 817 820-821
832 834-836 840 845-847 851 858-
861 864 869 874 878 883 897 901-
902 904-905 903 911-914 916 921-
922 924-927 929 932-934 936-939
941-942 945 955-958 963 966-969
977 979-980 985-986 990 992-993
997-1001 1005-1007 1012 1017-
1020 1023-1024 1029-1031 1034
1036 1039 1050 1059 1063-1066
1078 1081-1082 1085-1086 1089
1097 1103 1107 1109 1112 TIT?:
1117 1119 1121 1124 1127 1130
1134 1144-1145 1149 1151 1157-
1158 1167 1170 1178 1184 1188
1190 1193-1194 1200 1202 1215-
1217 1220 1226-1227 1229 1231
1241 1243 1247 1252 1258 1263
1267 1269 1279 1281 1284 1286-
1289 1293-1294 1306-1307 1312
1316-1320 1326 1333 1338 1341
1344 1348 1351 1355-1357 1368
1374 1377 1380 1386 1389-1390
1394 1400 1409 1414 1422-1423
1425-1427 1437 1443 1446 1454
1456 1458-1459 1468 1470-1472
1478 1482-1483 1487-1488 1493
1497 1499 1506 1508-1511 1517
1522-1524 1530-1533 1545-1546
1548-1550 1552 1557-1559 1563
1565 1567 1569 1571 1586 1588
1591 1593 1595 1598-1601 1608
1611 1620-1621 1624-1626 1628
1630-1632 1636 1640-1641 1644-
1645 1647 1649 1653-1655 1657
1664 1667 1669 1673 1678-1681
1686 1690 1694-1696 1701 1709
1711 1719 1722-1723 1726-1727
1731-1733 1738 1740 1743-1744
1747 1749 1753 1757 1758 1760-
1761 1765 1771 1785



adult brain | Clontech | ABR001 | 9 29 68-69 113 115 146 152 206
223 245 277 307 320 324 330-331
344 348 352 362 379 384 393 404
408 414 441-442 454 469 481 490
506 517 586 597 631 641 659 691
715 799 803 833 865 871 875 880
882 908 920 937 1000 1005-1006
1027 1036 1041 1043 1075 1107
1112 1121 1127 1136-1137 1144-
1147 1231 1238-1239 1280 1293
1320 1345 1355 1361 1383-1384
1400 1417 1448 1456 1476 1507
1570 1572 1609-1610 1614 1620
1626 1645 1653 1754 1759 1770
1786



adult brain | Clontech | ABR006 | 5-8 15-16 168 212-213 271 278
280-281 291-292 300-301 310 314
321 326 336-338 341 352 357 359-
360 362 369 374 379 384 393 396-
397 414 419-420 426-428 430 441-
442 453 506 616-617 661 689 785
798 845 1018 1109 1113 1124 1148
1167 1187 1207 1227 1252 1265
1285 1312 1317-1319 1324-1327
1344 1369 1381 1400 1416 1421
1427 1430-1431 1436 1471 1501
1557-1559 1586 1588 1651 1653
1664-1655 1671 1673 1690 1697-
1698 1700 1711 1717 1719-1720
1728 1736 1740 1743-1744 1757
1760-1751



Clontech 



ABR008 



5-10 13-19 22-23 25 29 33 37-39 
43-45 50-51 54-55 57-58 60-66 
68-70 72 75 77-80 83 85 89-92 94 
99-105 108-110 112-113 116-117 
123 128 133 135-137 139 143 145- 
146 148 152 154-155 157 166 168- 
172 174-175 181-184 188-190 193- 
194 196 198-200 202 204-205 207- 
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Hyseq 
Library Name 



S2Q ID NOS: 



208 210 214-215 218 221-22$ 229" 
231-232 234-241 245-247 251-253 
2S5 2S7-2S9 268-269 271 275-281 
285-286 288 290-292 300-302 304 
307 309-311 313 315 317-318 320- 
322 325-326 328 330-331 333-338 
341 344-347 349 352 354 356-357 
362 365-373 376 379-380 382 384 
387 390-391 393-394 397 399-403 
405-411 414-415 417-420 426-428 
437-438 440-444 4S3-455 462 464 
467 469-471 476 478 492-484 488- 
491 497 S03 506-513 516-517 520 
524-526 528-530 532-534 537-540 
542 544 547-551 553 561 565-567 
572-574 577 581 585 587-588 S90- 
591 597 599 601-602 606-610 612 
615-617 619-620 622-623 628-629 
631 633-634 636-641 643 645-647 
651-6S3 655-664 669-671 673 679 
682 687 689 691-700 702 706 710 
715-717 720-721 72S-734 736-739 
742-743 746 7S0-752 756 758-759 
7S2-764 766 768 773-778 780-782 
734-785 787-789 794 796 799 802- 
803 805 811 814-815 818 825-826 
834-837 839-840 342-843 856-859 
861-862 865 867-B72 874-875 881 
8B3-884.887 889-892 894-B9S 897- 
898 901 904 908 910 912 914 917 
919 921-924 926-927 930-932 935- 
941 943 945 949 953-954 958 961- 
963 967 969 971 975 977 981-983 
986 988-990 992 997 999-1002 
1004-1006 1008 1012 1018-1023 
1027 1029-1031 1035-1037 1047- 
1048 1053 10S7 1059 1063 1068 
1070 1072-1075 1077 1081-1083 
1085-1093 1095-1096 1108-1112 
1114-1125 1127 1131-1133 1135- 
1138 1142-1145 1148-1158 1160-
1163 1167 1169 1172 1175 1177 
1180 1183-1188 1191-1195 1199- 
1200 1204 1206 1211 1213-1216 
1222-1223 1226-1227 1229-1231 
1234-1235 1241-1242 1244-1263 
1266 1269-1271 1276-1277 1279- 
1281 1284-1286 1292 1294-1295 
1299 1305-1309 1312 1314 1316-
1319 1322 1324-1327 1330 1332 
1334-1335 1339 1344-1346 1351 
1354-1355 1357-1358 1365-1367 
1369-1370 1373-1374 1376-1379 
1381-1384 1386-1388 1392 1394 
1396-1397 1400 1403-1407 1410 
1414 1419-1420 1423 1432-1433 
1435 1437-1438 1440-1442 1446 
1448 1453-1455 1457 1461 1463- 
1464 1466 1468 1471 1477 1480 
1482-1483 1495 1502-1504 1507- 
1509 1513 1519-1520 1524-1526 
1536 1547 1549-1552 1567 1573- 
1574 1578 1586-1589 1597-1598
1601-1602 1605 1607-1609 1611- 
1617 1619-1621 1623 1625-1626 
1635-1641 1643-1645 1649 1651 
1653 1656-1658 1664 1669 1671- 
1674 1676-1684 1686 1689-1690
1694-1696 1704-1705 1708-1709 
Tissue Origin | RNA Source



Hyseq 
Library Name 



SEQ ID NOS: 



adult brain 



1720-1724 1726-1728 1730-1733 
1737-1740 1742-1745 1753 1756- 
1757 1759-1761 1765 1767 1771- 
1772 1776-1777 1779-1780 1786



Clontech 



ABR011 



adult brain 
adult brain 



24 75 103 186 210 310-311 364- 
365 508 623 710 937 1002-1003 
1059 1204 1609 1731-1732 



BioChain 



ABR012 



46 182-184 204-205 300 739 767 
1371 1549 1620 1684 



adult brain 



Invitrogen 
Invitrogen 



ABR013 



ABR014 



185 204-205 364-365 393 497 595 
687 692-694 830 845 1068 1320 
1413 1640 



adult brain 



adult brain 
adult brain 



Invitrogen 



187 301 357 364-365 375 454 463 
731 859 939 983 1073 1262 1270
1320 1403 1640 1651 1657 1696 
1722 1738 



ABR015 



419 434-435 441-442 763 789 983 
1320 



Invitrogen 
Invitrogen 



ABR016 



312 364-365 379 1320 1334-1335 
1674 1722 1785 



cultured 
preadipocytes



14-16 22-23 25 37-39 43 58 60

70-72 78 86 94 107 113 116 136- 
137 143 146 152 161 173 182-184
194 196 198 210 218 229 259 267
295 298 309-310 320-321 324 336- 
338 346-347 349-350 356-357 362 
371 379-380 382-383 391 393 396 
399 401 408 428 438 459 461 476 
482 490 502 507-509 516 526 531 
557 562 597 602 607-609 624 652 
655 667 669 671-672 687-689 695- 
696 710 712 715 721 732 739 743 
750 753 766 778 780-781 789 803 
814 826 830 837 841 857 869 874 
894-895 925 937 949 954-956 960-
961 963 968-969 988-989 1000
1005-1006 1016-1019 1021 1036- 
1037 1052 1086 1090 1109 1113 
1115 1120-1121 1123-1124 1136- 
1137 1140 1144-1147 1151 1167 
1170 1174 1188 1193-1194 1205 
1225 1229 1231 1254 1258 1262
1280 1285 1309 1312 1334-1335 
1341 1343-1344 1356-1357 1370 
1378-1379 1383-1384 1403-1404 
1423 1429 1434 1442 1448 1451- 
1452 1454 1470-1472 1482 1499 
1525 1528-1529 1532 1536 1547
1554 1557-1559 1561-1562 1567
1585 1588 1590 1595 1601-1604 
1608 1610-1613 1615 1619 1624 
1627 1640 1644 1647 1660 1664 
1666 1670 1675 1696 1704 1715 
1723 1727 1738 1760-1761 1768
1779 1785-1786 



Stratagene



ADP001



5-8 11 17 25 68-69 80 82 87 103
105 110 116 136-138 168 171 188-
189 196-198 261 267 276 288 293
301 318 331 336-338 379-380 391
400 428 430-431 510-512 520 524
527 549 557 561 602 618 620 622
631 637 647 670 681 682 710 731
748 782 793-794 817 834-836 843
845 858-859 879 882 893-895 934
960 982 986 995-995 1000 1002
1005-1007 1025 1027-1028 1032
1039 1045 1071 1078 1097 1099-
1102 1136-1137 1140 1219-1220



adrenal gland 



1260 1271 1297-1298 1314 1320
1322 1329 1339 1345 1365-1366
1370-1371 1398 1408 1423 1431
1437 1466 1468 1533 1539 1594
1602 1606 1614 1631 1649-1650
1660 1662 1673 1687-1688 1695
1711 1719-1720 1742 1746 1749
1760-1761 1765 1767 1771 1785



Clontech 



ADR002 



adult heart 



GIBCO 



AHR001 



4-10 15-16 25 29-31 43-45 47 50- 
51 55 60 62-63 65-66 75 80 102 
116 118 122 126 130 137 150 169-
170 181 192 198 201-203 215 227- 
228 247 251 255 267-269 271 280- 
281 285 295 298 311 336-338 342 
349 351-352 354 372-373 383-385 
391 400 410 415-416 424 426-427 
431 434-437 439 445 454 461 473 
477 483 491 493 497-498 503 516 
519 527 535 546 549 552 572-573 
581 588 595 600 602 608-610 620 
628-630 637 645-646 670 679 703 
713 715 719 732 734 744-746 758 
773-778 789 816 829 837 845 848 
869 875 883 898 904 912 922-923
930-931 942 948 952 965 967 969 
976-977 981 990 992-993 1001 
1004 1049 1055 1059 1071-1072 
1076 1112-1113 1115 1121 1127 
1134-1135 1151 1158 1163 1175
1181 1188 1209 1218 1224-1225 
1227 1231 1243 1270-1271 1274 
1280 1285 1290 1293 1307 1324- 
1325 1327 1330 1342-1343 1345
1348 1365-1366 1369 1378-1379 
1387 1398 1400 1405 1417 1425- 
1426 1436 1440-1441 1444 1454 
1463-1464 1488 1491 1507 1512 
1538 1546 1567 1573-1575 1588
1598 1609 1614 1618 1622 1624 
1627 1634 1636 1649 1651 1658 
1671 1674 1678-1679 1691-1692 
1703 1717 1727 1731-1732 1737 
1765 



4-8 10-11 15-16 18-21 34-39 44-
45 50-52 57-58 60 62-63 71 75 82 
85 87 89 94 97 100 103-104 108- 
110 112 114 116 118-119 122-123 
127 130-132 134 136-138 141-144 
147-151 153 163-164 166-171 179 
186 192 195 197 199 204-205 212- 
215 220 225-226 229-230 232 234- 
236 251 257-260 262 265 272 274 
277 280-282 285-286 289-292 296
298-301 304 307 309 314 321 324- 
325 330 333 336-338 345 349 351- 
352 354 358 361 368 370 380 383- 
384 387-388 391 393 397 401 406
408-409 411-412 414-416 430-431 
433-439 445-446 449 452 454-455 
457 459 462 469 472-473 476-480
483-484 487-490 492-493 496-498
503 506 508 510-513 516 519-522
526 534 536-540 542 546 549 553 
560-562 574-577 581-582 584 586-
587 589 593 595 597 604-609 611- 
612 615-620 622-623 626 632 637 
645-652 656-660 665-666 670-672 
674-675 683-684 687 692-694 697 
701 709 712 715-716 719-720 725- 










726 728 730-732 735 738- [illegible]
744 746 751 753 759 761 765 770-
771 775-780 785 788-790 796 802
804 810 812 817 821 826 [illegible]
837 843 845-847 849-853 857-861
863-864 869 871 875 877-879 881
883 887 890-892 894-895 897-898
901 903 906-907 911-913 915 919
921-925 927-928 933-935 945 958
961-963 967 969-972 975 977-978
980-986 990 992 999-1002 1005-
1007 1010 1016 1019-1020 1022-
1023 1025 1028-1037 1039-1040
1043 1047 1050 1054-1055
1059 1063-1064 1067-1068 1070
1072 1075-1076 1083 1085-1087
1089 1093-1094 1104 1106 1108-
1113 1116-1117 1119 1121
[illegible] 1126 1128 1131-1134 1144-
[illegible] 1148-1149 1151 1158 1167
1169-1170 1175 1177 1192 1195
1199-1200 1202 1206-1208 1211
1216 1218 1222 1227-1229 1232-
1235 1238-1241 1243-1244 1247-
1248 1250 1253-1254 1256-1258
1261 1268 1270-1271 1277 1280-
1282 1287 1292 1298-1299 1306
1308 1317-1321 1324-1325 1330
1332 1334-1337 1339 1344-1345
1349-1350 1354-1356 1359-1360
1365-1366 1369 1371 1374-1375
1378-1380 1383-1384 1389 1397
1400 1403 1409 1417 1423-1426
1437 1439 1442 1444 1446-1447
1450 1453 1468 1470 1473 1479
1481 1488 1490 1501-1504 1519
1521 1524 1528 1530-1534 1536-
1537 1539 1541-1542 1547 1553
1555 1560* 1565 1567-1571 1588
1591 1597-1598 1601-1602 1605
1614-1616 1619-1620 1623-1628
1630-1632 1634 1636 1641 1644-
1645 1647 1649 1652-1655 1659
1662 1667 1673-1674 1680-1681
1684 1686-1688 1704-1705 1709
1711-1712 1717 1724 1726-1727
1731-1733 1737-1738 1741 1743-
1744 1749 1754-1755 1760-1761
1765 1772 1785








adult kidney 


GIBCO


AKD001 


4-8 10-11 17-21 29-31 35 [illegible] 35 42-
45 50-51 56-58 60-61 64 68-69 75
77 80 82 85 87 92-94 97 100 102-
104 107-108 112 116-117 119 123
127-133 136-137 139-141 143-144
147-154 157 161-163 165-166 169
172 176 178-179 192 194-197 199
201 203-206 209-210 212-213 215-
216 223-228 234-236 238 247 251-
253 257-259 261-262 265-269 271-
272 274 276-277 279-281 284-286
290 293 295 298-299 301-302 304
307 311-313 321 325-326 329-331
333 341 344 348-350 352 356 358-
359 362 364-365 368 370-372 374
376-377 380-382 392 395 398 400-
401 404 407-409 414-415 423-424
430-437 443-444 446 449 451 453-
455 459 461-462 464 467 469 471-
474 476-477 480-481 483 487-488




















[illegible] 491 493 497-505 510-513 515-
[illegible] 522 524 526-529 [illegible] 537-540
544 547 549 554-556 562 564
567 571-576 578 582 [illegible]
593 598-599 601 604-606 [illegible]
615-619 621-626 632-[illegible] 637-643
645-652 655 660-664 669- [illegible]
[illegible]-679 688 692-695 698 702 711
713 717 719-720 727 731 735-736
738 743 745-746 751 [illegible]
763 765 771-773 775-778 [illegible]
789 793 795-796 800 [illegible]
810-812 814-819 821 [illegible]
838 842-845 848 [illegible]
[illegible] 865 867 869 871 874 [illegible]
[illegible] 887 889-891 893-896 898-900
902 906-908 910-914 918 920 922
925-927 929-935 937 940-942 945
948-949 951 953-958 960-961 963-
964 969-970 972 976-978 982-986
988-990 992-993 995-997 999-1002
1004-1008 1010 1012-1013 1016-
1017 1019-1020 1022 1025-1031
1035 1038-1040 1042 1044 1047
1050 1054-1055 1057-1064 1068
1070-1073 1078 1085-1086 1088-
1089 1092 1094 1097 1099-1102
1107 1109-1112 1116 1119 1121
1123-1125 1132-1135 1140 1142-
1143 1146-1147 1149-1150 1153-
1154 1157 1159 1163 1167 1170
1178-1179 1181 1183 1192 1196-
1200 1202-1204 1206-1211 1216-
1219 1221-1222 1225 1227-1230
1232-1234 1238-1241 1243-1244
1246-1247 1253 1257-1258 1260-
1261 1267-1268 1270 1272-1274
1281 1283 1287-1289 1293-1295
1299 1306 1308 1311-1313 1317-
1320 1323 1329-1330 1334-1335
1339 1341 1349-1350 1353-1357
1359 1367 1369 1373 1375 1378-
1379 1394 1397 1400 1403 1405
1407-1409 1417 1419 1423-1424
1428-1431 1433 1437-1438 1442-
1443 1445-1446 1448-1450 1453-
1454 1459 1461 1465-1468 1474-
1475 1478 1484-1488 1490 1492-
1493 1495 1497-1498 1506-1507
1509 1512 1518 1521-1522 1525
1527-1528 1532-1533 1537
1541 1547-1550 1552 1556-1559
1561 1565-1566 1568 1571 [illegible]
1578-1579 1583 1586-1587 1589
1591-1592 1594 1598 1600 1603-
1604 1606 1608 1611 1613 1615-
1616 1618-1622 1624-1628 1631-
1632 1634-1636 1638-1639 1641
1644 1646-1649 1653-1656 1662
1664 1666-1667 1670-1671 1676-
1679 1683-1684 1686 1691-1692
1696-1699 1701 1709-1711 1713-
1714 1716-1719 1723-1724 1726-
1727 1733 1737-1738 1741 1743-
1744 1748-1749 1751 1760-1761
1763-1768 1778 1780 1785




adult kidney 


Invitrogen


AKT002 


20-21 37-39 47 52 57 60 65-66
68-69 80 104 107-108 122 130 133
136-137 140 142-143 149 169 174



adult lung 



181 197 227-228 235-236 244 [illegible]
261-265 267 280-281 286 290 299
301 304-305 309 312-313 339 341
344-345 349 358 370-372 376 382-
383 387 392 401 414 416 421 430
443 445 449 453-454 472 487-488
504 506 513 516 519 522 528 536-
540 546 554 585 587 594 598 602
607 616-617 626-627 636 643 662-
664 695 709 721 735 743 761 768
775-777 788 796 804 814 827 837-
838 849-850 852-853 869-870 881
890-892 898 903 905-907 914 919
925 927 934 941 949 952 957 960
962 968 970 1000 1008 1029-1030
1044 1052 1055 1063 1067-1068
1073 1085 1099-1102 1107 1110-
1111 1113 1115 1119 1126 1134
1136-1137 1146-1148 1153 1159
1192 1196 1199 1232-1233 1241
1256 1264 1272-1273 1281 1285
1293-1294 1299 1312 1320 1324-
1325 1330 1344 1349 1351 1355-
1356 1369 1378-1379 1403 1414
1419 1428-1429 1436 1446 1458
1463-1464 1467-1468 1470 1477-
1478 1486 1491 1509 1519 1527
1529 1534 1547 1596 1600 1619
1623 1629 1631 1634 1638 1643
1647 1652 1660 1664 1667 1669-
1670 1673 1686 1709 1727 1740
1776



GIBCO 



ALG001 



lymph node 



4-8 14 37-39 44-46 50-51 56 62-
63 75 82 88 93 103 104 113 125
133 140 143 150 152 154 157 162
171-172 174-175 190-191 196 200
211 214 219 223-224 227-228 251-
252 256 265 272 274 280-281 285
310 332 345 351 362 371 381-382
394 408-409 431 436 445 454 459
461 467 469 471 476-477 488 504
513 527 537-540 544 547-548 554
564 583 607 616-617 621 623-624
634 645-646 662-664 670 695 716
719 743-744 763 766 774 789 803
811 814 817 831-832 837-838 845
852-853 858-859 861 866 880 887
901 905 941 954-957 966 971 977
979 981 987 990 992 996 1001
1005-1006 1014 1017 1045 1047
1054 1059 1062 1064 1072 1080
1086-1089 1094 1107 1126 1134
1136-1137 1142 1150 1157 1173
1190 1200 1208 1220 1241 1272-
1273 1280 1282 1295 1306 1320
1331-1332 1353 1374 1379 1383-
1384 1404 1409 1423 1434 1436
1442 1474 1478 1494 1509 1522
1525 1531-1532 1547 1549 1553-
1554 1571 1598 1606 1613 1624
1627-1629 1632 1642 1644 1662
1669 1676-1677 1684 1696 1727
1731 1732 1737-1738 1748-1749
1786



Clontech



ALN001 



4 24 50-51 82 105 137 153 198 
201 223-224 234 268-269 272 280- 
281 287 301 312 329 343 382 421
430 433 445 451 461-462 475 481- 
482 503 526 529 537-540 546-547



young liver 



GIBCO 



ALV001 



adult liver 



Invitrogen 



ALV002 



621 626 649 679 719 725-726 738
793 803 831 834-836 838 844 857- 
858 866 879 905 913 928 963 975
1005-1006 1012 1038 1050 1116- 
1117 1151 1199 1204 1226 1243 
1265 1274 1324-1325 1339 1353 
1374 1377 1440-1441 1447 1504 
1549 1600 1618-1619 1631 1641 
1644 1653 1687-1688 1691-1692

1741 1771 

5-8 11 20-21 46 50-51 58 65-66
75 79 92 93 97 102-103 108 110 
116 139 143-144 148-149 171-172 
174 187-189 194-195 198 209 214- 
215 230 250 258 267-269 280-281
306 309 342 351 356 359 362 372 
374 392 394 398 401 407-408 410 
414 431 444 455 459 476 478 483 
493 510-512 516 520 522 526 536 
549 571 574-577 585 592 601-602 
607 621-624 628-630 632-633 637 
648 660 666-667 678 697-698 700
717 719 728 730 734 738 744-745 
766 770 773 779 788 800 808 812 
814 841 849-851 871 874 879 887
893 898-900 902-904 906-907 911 
919 922 924 934 953 957 963 965 
970 984 986 997 1001 1004 1007 
1012 1029-1030 1033-1034 1052 
1061 1066 1070 1076 1085 1089 
1093 1099-1102 1110-1112 1116- 
1117 1119 1121 1125 1136-1137 
1144-1145 1156-1157 1159 1196
1199-1200 1209 1211 1219-1220
1241 1244 1262 1270 1275 1279 
1283 1295 1317-1320 1332 1339 
1344 1359 1362-1363 1379 1383- 
1384 1403 1415 1430-1431 1437 
1450 1467 1475-1476 1483-1484 
1494-1495 1498 1505 1512 1516 
1518-1519 1526 1529 1547 1550- 
1552 1557-1559 1565 1583 1587
1597 1609 1614 1620 1631 1637 
1641 1644 1654-1655 1662 1667 
1669 1684 1691-1692 1702 1711 
1725 1738 1741 1743-1744 1758 
1760-1761 1763-1765 1769 



5-8 17 20-21 32-33 41 55 58 64 
75 77 86 89 102 108 117 119 175-
176 198 200 209 231 235-236 250 
272 275-276 284 306 316 321 325 
333 356 359 374 376 398 401 408 
414 428 430 433-435 454 476 494 
503-505 517-518 528 534 544 552
561-563 567 578 581 608-609 630 
632 637 644 650 661 665 672 702 
707 710 721-722 750 753 778 782
794 814 820 826 834-837 847 849- 
850 858 861 874 879 893 898 904
911 918 921-922 925 946 948 972 
978 986 996 1020 1027 1031 1034 
1053 1063 1068 1070 1073 1086 
1089 1093 1097 1113 1119 1156 
1159 1195 1198-1199 1208 1220
1227 1241 1261 1272-1273 1277 
1285 1308 1315 1320 1324-1325 
1330 1362-1363 1375 1403 1408- 
1409 1415 1431-1432 1435 1467 
1469 1482 1504 1524 1542 1547 






WO 01/53312 



PCT/US00/34263



Tissue Origin 



adult liver 
adult ovary 



RNA Source 



Library Name 



SEQ ID NOS: 



[illegible] 1567 1578 1581 1583 1594-
1597 1601-1602 1611-1612 1615
1618-1619 1621 1625 1637 1645
1647 1652 1654-1655 1660 1666
1669-1671 1684 1706 1722 1737-
1738 1742-1744 1760-1761 1763-
1765 1772 1774



Clontech 



ALV003 



29 676 997 1063 1119 1536 1766 



Invitrogen 



AOV001 



1 4-18 20-23 29 35-40 42-48 50-
51 53-58 61-63 65-66 68-69 73-75
77-78 80 82 85 87 89 97 100-101 
103-104 106-108 110 113 115 118 
122-124 126 128 133-134 136-140 
142 145-147 149-157 161 166 168- 
170 174 177-178 180 182-186 188-
189 192-203 207 209 211-215 219 
221-224 229-230 234 242-243 246- 
247 255 258 260-262 265-269 271- 
272 274 277-281 284-286 288 290
295 299 301-302 304 307 309-311 
313-314 316 321 323-326 330 332- 
333 335-338 341 344 349 352-353 
356 358 360 362 370-372 376-377 
379-384 387 390-392 394 397-398 
400 403 408-410 412 414-416 423- 
424 426-427 430-435 439 443-446 
448-449 451 453-455 462-463 468- 
471 473 476-479 481-484 487 489-
494 496-497 499-501 503-505 509- 
514 516-517 519-520 522 524 526 
528-534 541-544 546-547 549 552 
554-555 561-564 566-567 569-570 
572-573 575-576 579 581 583 585-
588 590-591 593 595 597 599 601- 
605 607-613 615 618-622 624-627
630 632-633 636-640 642 644-647 
649-652 654-655 657-665 667-675 
677-678 681 683-684 692-695 697- 
710 714-721 723 725-727 729 732 
734-735 743-746 750-751 753 758 
763 765 767 772-773 775-778 780 
783-784 786 788 790-791 794-796
800 803 805 809-811 813-815 818- 
819 821-824 826 828-829 831-832
837-838 843-850 852-857 859-864 
867 869 871-872 874-875 878-883 
887-888 890-895 898-910 912-914
916 919-922 924 926-927 929-939 
941 943-946 948-951 953 955-958 
961-964 966-967 970-979 981-982 
985-986 988-990 992 995-997 999- 
1001 1004-1009 1011-1013 1016 
1019-1020 1024-1025 1029-1031 
1033-1035 1037 1039 1041-1047 
1050-1051 1054-1060 1062-1064 
1067-1070 1072-1073 1075-1076 
1078-1079 1085-1086 1089-1090 
1094-1096 1098-1103 1106-1108
1112-1117 1119-1120 1123-1127 
1131-1135 1142-1143 1146-1149 
1153 1156 1158 1163 1165-1166 
1169-1171 1173-1175 1177-1178 
1180 1183-1185 1190-1191 1195 
1197-1200 1202 1205-1214 1217- 
1219 1221-1226 1232-1235 1238- 
1241 1243-1244 1247 1249 1252- 
1254 1256-1258 1262 1265 1267- 
1268 1270 1275 1278 1280-1283
1286-1289 1291 1293-1294 1298-



1299 1306 1308 1312 1317-1321
1323 1327 1329-1330 1332-1333
1338 1339 1341 1343-1351 1356
1359 1361 1365-1366 1371-1375
1377-1379 1383-1384 1386 1389
1394 1400 1404 1416-1417 1422-
1427 1429-1431 1435-1436 1439-
1443 1445-1450 1453-1454 1459
1463 1464 1466 1468 1470 1474-
1481 1484-1485 1488 1491 1493-
1494 1496-1498 1501-1504 1506-
1507 1511-1517 1519 1521-1524
1526-1527 1530-1531 1534-1536
1538 1539 1541 1546 1548-1550
1553 1555-1559 1561-1563 1566-
1567 1569-1570 1572 1574-1575
1578 1580-1581 1587-1588 1590-
1591 1595 1597-1598 1600-1606
1609 1611* 1621 1623-1630 1634
1636 1638 1641 1643 1645 1647-
1657 1659-1662 1664 1667 1669-
1671 1673-1674 1676-1681 1683-
1690 1695 1702-1707 1710-1711
1713-1714 1716-1719 1723-1724
1726-1728 1731-1733 1735 1737-
1738 1740-1741 1743-1744 1748-
1751 1753 1755-1756 1760-1762
1765 1767-1768 1770-1771 1776
1778-1779 1783-1784 1786



adult placenta



Clontech 



APL001 



5-6 44-45 90-91 107-108 159 171 
311 351 414 476 503 545 574 624 
636 719 755 773 860 890-891 924 
947 955-956 962 990 992 1002 
1045 1202 1320 1369 1628 1686 
1713-1714 1743-1744 



APL002 



14-16 26 29 43 60-61 79-80 103
106 116 135 171 177 180 194 196
198 210 216 235-236 272 290 299
309 329 334 339 359 379-380 417
423 430 434-435 448 454 483 490-
491 517 522 631 723 725-726 728
738 746 769 818 843 854-855 857-
858 916 948 953-954 976 988-989
1005-1006 1013 1033 1036 1064
1068 1070 1086 1139 1144-1145
1160 1277 1285 1317-1320 1343
1345 1429 1435 1438 1454 1482
1486 1490 1512 1519 1532 1549
1592-1593 1602 1626 1647 1649
1664 1673 1675 1722 1727 1730
1746 1776



placenta



Invitrogen



adult spleen 



GIBCO 



ASP001 



3 5-8 12 15-16 19-21 24 29 34-36
44-45 57 60 82-83 87 89 94 98-99
103 106 108 117 119-121 139 141
147 152-153 155 166 169 171 174
178-180 196 198 201-206 209-211
215 219 234 253-254 256 258 264
272 280-281 290 295 302 309 312
325 333 341 349 358 372 382 386-
387 394 406 414 431 434-436 446
448 451 473 481 490-493 500 503
505 517 519 530 534 536 540 547
554 557 574-576 582 592 595 604
611-612 620-621 623 631-632 642
652 659 661 667 671 673-675 684
700 721 728 730 732 738 742-744
746 762 765 774 780 788-789 794
810-811 817 822 830 832 845 848
852-853 858 862 866 874 879 882



884 906-908 912 919 921-923 926-
927 934 942 949 957-958 963 977-
978 983 990 992-994 996-997 999
1005-1007 1010 1012 1031 1036
1042-1044 1046 1049 1059 1068
1070 1076 1089-1090 1094 1103
1109 1113 1115 1124 1140 1163
1170 1174 1177 1190 1196 1219-
1220 1226-1227 1229 1236 1241
1246 1258 1269 1271 1274 1295
1301 1320 1322 1330 1334-1335
1339 1349 1351 1353 1359-1360
1364 1369 1374 1386 1397 1413
1417 1434 1436-1437 1439 1468
1474 1477 1480 1485 1487 1498
1512 1522 1525 1544 1549 1553
1560 1567 1591 1600 1631 1636
1651 1654-1655 1658 1662 1670
1674 1678-1679 1684 1686 1700
1727 1733 1738 1740 1741 1760-
1761 1774 1779 1781-1782



testis 



GIBCO 



ATS001 



5-8 10 26 30-31 47 50-51 57 68- 
69 82 84-85 97 102 113 119 137 
139 150 152 154 156 163 169 174 
176-177 192 194 196-197 212-215 
227-228 247 255 258 261 282 285 
288-289 301 307 311 316 330 334 
349 370-372 392 398 410 415 426- 
427 430-431 433 437 446 454 461 
469 473 477 481-482 493 499 502- 
503 513 522 526 547 552-553 563- 
564 572-573 575-576 581-582 585 
599-602 605 612 615-617 620 631 
637 647 649-650 656 660 665 670 
674-675 712 719-721 723 728 731 
738 744 746 773 780 784 788-789
802 804 809 811 814 826 831 837
843 845 848 859 866 869 877 905 
913 916 919 921 926 929 937 950 
960 963 971 975 977 981 990 992-
993 1007 1016 1029-1030 1034- 
1035 1038-1039 1045 1059-1060 
1064 1070 1072-1073 1087 1089
1097 1099-1102 1104 1108 1113 
1141 1149 1161-1162 1175 1208- 
1209 1222 1227 1229 1231 1235 
1238-1239 1243 1253 1285 1287- 
1289 1291-1293 1307 1311 1317- 
1320 1330 1332 1338 1345 1369 
1373-1374 1379 1389 1399-1400 
1409 1423-1424 1430 1435-1437 
1443 1459 1484 1486 1490 1493 
1496-1497 1501 1505 1509-1513 
1527 1530-1531 1533 1537 1546 
1549 1563 1565 1567 1569 1571 
1577 1586 1591 1599 1602 1625 
1628 1630-1632 1636 1639 1642 
1649 1661-1662 1666-1667 1670 
1675* 1684 1690 1699 1705 1712 
1717 1724 1730 1737-1738 1752 

1767 1779 

686 1352 1412 



Genomic DNA 
from BAC 63118 



Research 
Genetics 
(CITB BAC 
Library) 



BAC001 



Genomic DNA 
from BAC 39316 



Research 
Genetics 
(CITB BAC 
Library) 



BAC002 



1411-1412 
Tissue Origin



Genomic DNA 
from BAC 35316 



adult bladder



bone marrow 



RNA Source 



Research 
Genetics 
(CITB BAC 
Library) 
Invitrogen



Clontech



Hyseq 
Library Name 



BAC003 



BMD001 



SEQ ID NOS: 



1352 



5-8 17-18 22-23 33 37-39 56-57
80 93 100 120-121 169 201 237 
251-252 272 278 311 348 363 382 
413 415 424 430 443 483 502 542- 
543 562 564 607 616-617 626 635 
652 667 671 710 727 755-756 762 
773 786 789 837 840 866 893 898 
909 918 929 966 977 983 1016 
1025 1055 1073 1082 1140 1167 
1185 1189 1199 1270 1369 1481 
1536 1560 1573 1596 1614 1636- 
1637 1649-1650 1654-1655 1658 
1669 1671 1690 1719 1727 1731- 
1732 1739 1741 1760-1761 1779 



3-8 11 13 13 29-31 33 35-36 40 
43-45 47-48 50-51 57 60 65-66 75 
80 82 85 88-89 94 100 103 107 
110 115 118-119 124-125 133-134 
136-137 139-141 146 150 152-153 
155 161 163 168-170 172 178-180
187 192-193 197-198 203-205 210- 
213 215 217 219 222 224-226 233 
235-237 242-244 255 258 260 263- 
264 266 273 276 278 283 286 290 
295 301-302 307 312-313 321 330 
333 339 343 352 357-358 370-371 
382 384-385 387 389 394 408 410 
412 416 421 424-427 429-431 436- 
437 439 441-442 445 447 454-456 
461-462 471-472 475 477-479 481- 
482 485 488 493 498 500 503-506
513 516 519 523-524 526 530 535- 
540 542 544-545 549 555 565 567
569-577 581 583-586 588 593 601 
603-604 608-609 613-619 621-622 
632-633 636-637 642 649-650 656- 
660 666 670 672 674-675 679 683 
701 708 716 718-720 731 735-736 
740-742 744-745 752 761 765 772-
773 775-778 780 785-786 789-791 
796 798 802 810-812 823-824 826
830 832-833 837-838 843-844 848-
855 858-859 866-867 869 878-880 
883 890-892 896 903 905 908 912-
914 922-924 927 930-931 937 939- 
941 952-953 955-958 963 969 973 
976 981 985 987 990 992 995 1000
1002 1005-1007 1013 1016 1025 
1028-1031 1033 1035 1037 1039
1042 1044 1047 1050 1053-1054 
1059 1061 1063 1066 1070-1071 
1079 1106 1110-1113 1115-1117 
1124 1126 1134-1135 1142 1144- 
1145 1163 1172 1178 1197 1199- 
1200 1202 1216-1217 1224 1227- 
1228 1240 1246 1254 1261 1266 
1270 1278 1281 1285 1287 1290-
1291 1293 1299-1301 1308 1314 
1317-1320 1327 1331 1339 1343 
1346 1349 1353 1356 1361 1367
1369 1372-1374 1379-1380 1394
1400 1403 1406 1408 1413 1417 
1419 1423 1425-1427 1430-1431 
1433 1439 1443 1446-1449 1459 
1463-1464 1482 1486 1493-1494



1506 1509 1513 1521-1522 1524 1526
1528 1531 1536-1537 1543 1546 1548-
1549 1552 1554-1555 1557-1559 1571-
1572 1581 1589-1592 1597-1600 1609
1614 1621 1626-1628 1630-1632 1634
1636 1638-1639 1641 1646-1647 1651
1653-1655 1661-1662 1676-1681 1684
1686 1690 1702 1707 1711 1713-1714
1717 1720 1722-1723 1727 1737-1738
1740 1758 1767 1772 1781-1782
1785-1786





bone marrow 



Clontech 



TT 15-16 19 30-31 35-36 68-69 75 
83-84 93 99 103 108-109 118 137
139 169-170 174 177 180 190 193 
212-213 219 222 225-226 232 237 
255 259 264 273-274 284 286 290- 
292 295 301 303-304 307 312-313 
316 324 326 330 334-335 348 352- 
353 357 350 370-373 384 386-387 
397 403-404 414-416 421 425-427 
429-430 433-436 440 444 451 454 
465-466 472 475 478 491 493 516 
520 523 525 531 545 548 552 566
569-570 581 583 590-591 597-598
601 616-617 621 641 650 652 656 
659 671 674-675 679 684 710 718- 
719 728 734 737-738 742 761 765 
774-778 790 811 814 818 830 834-
836 854-855 859 866 869 871 878- 
879 884 889 892 904 922-923 932 
990 992 998 1001 1004 1016 1036
1042 1048 1051 1054-1055 1058 
1088-1089 1106 1112-1114 1155 
1157 1192 1200 1223 1227-1228 
1236-1237 1260-1261 1282-1283
1285 1287 1295 1314 1317-1321
1324-1327 1330 1333 1341 1343 
1347 1350 1353 1355-1357 1367
1369-1370 1373 1377 1379 1381 
1383-1384 1394 1397 1400 1406 
1413 1417 1425-1427 1438 1442 
1446 1459-1460 1470 1493 1505 
1521 1536 1546-1549 1550 1573- 
1574 1578 1598-1600 1621 1626
1631 1634 1646 1649 1653 1656 
1658 1669-1670 1683-1684 1687- 
1688 1690-1693 1696 1699 1702 
1704 1707-1709 1711 1720 1722- 
1723 1725 1727 1729 1731-1733 
1738-1740 1743-1746 1752 1755 
1760-1761 1767 1777 1781-1782 

1786 

73-74 503 922 1036 1711 

95-96 866 1320 1475 



BMD002 



bone marrow 
bone marrow 



Clontech



BMD004 



adult colon 



Clontech 
Invitrogen 



BMD007 



CLN001 



17 56-58 103 110 117 144 150 171 
179 185 188-189 201 204-206 210 
218-221 225-226 231 237 251 277 
288 310 312 320 333 359 386 388 
394 408 420 455 481 485 503 510- 
512 590-591 615 635 647-648 665 
672 684 697 710 725-726 743 780 
786 788 826-827 848-850 854-855 
858 866 872 898 918 921-923 953 
976 983 993 1005-1006 1017 1020 
1025 1027 1054-1055 1063 1068- 
1069 1140 1153 1170 1185 1196 
1199 1220 1280 1314-1315 1320 
1345 1351 1355 1369 1428 1439 
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Tissue Origin 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



1452-1464 1512 155* 1583 1587
1594 1596 1614 1625-1626 1631
1639 1645 1650 1675-1677 1687-
1688 1701 1713-1714 1724 1740

1765

Toi 1490 1686 ' 



Mixture of 16 
tissues - 
mRNAs



Various 
Vendors 



CTL016 



Mixture of 16
tissues - 
mRNAs* 
adult cervix 



Various 
Vendors 



CTL021 



312 782 1132-1133 1403 1712 1715 



BioChain



CVX001



1 4-8 11 13 1B-21 2^-26" 30-31 33 
37-39 43 46-47 58 61 64-66 71 
73-74 82 85 94 100 103-104 113
118 122 126 130 134 140 147 153- 
156 163 170 179 181 186 192 195-
196 198 201-202 218-219 222 229-
231 257 266 275-277 285-286 288 
298 301-302 304 307 312-314 324 
326 329-330 332 335 342 352 358 
362 371-372 376 379 381-382 384 
388 398 400 410 414 416 419-420 
426-427 430-431 433-436 439 446 
448 461-462 464 471-477 479 482- 
483 491 493 496 503 506 510-513 
516-517 526 530 535 542-544 546- 
547 557 561 572-573 575-577 581-
582 585-586 588-589 593-594 600
602 604-605 607-609 612 615-619
623 644 650 654 657-658 662-665
670 672 680 683 691-694 698 706 
708-709 711 713 720-721 727 729 
731-732 737 745-747 753-754 760 
765 771 774-777 780 790 793 796 
798 800 803 805 818 826 828 831- 
832 834-836 843 847-848 851-855
857-860 864-866 869 871 876 878- 
880 882 887 890-891 897 899-902
905-908 912-913 916 918-919 922 
927 932 934-938 944 948 955-956 
958 963-964 967 969-970 972 976 
978-979 983 985 990 992 1000 
1005-1007 1016-1017 1024 1027 
1033 1036 1038 1045 1047 1053-
1056 1066-1067 1071 1073 1075 
1079 1082 1098 1113 1124 1129 
1134 1139 1146-1149 1163 1167 
1170 1173 1175 1177 1181 1197 
1200 1202 1211 1214 1216 1221- 
1222 1225 1227 1232-1234 1240- 
1241 1243 1258 1264-1265 1268 
1270 1279 1287-1290 1308 1310- 
1311 1316 1320 1323 1327 1345 
1349 1353-1354 1360 1372-1374 
1383-1384 1386 1394 1397 1405- 



The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal adult liver
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA
(Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA
(Clontech), 10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord
mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain).
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS:



diaphragm 



BioChain 



DIA002 



1406 1416 T42T 1431 1436-
1437 1442 1446 1448 1453 1459
1466 1472 1478 1482 1496 1501-
1503 1506 1512 1522 1527-1528
1531 1533 1541 1547 1569 1571
1585 1589 1597-1598 1600 1608-
1609 1614-1616 1620 1623-1624
1626-1628 1630 1638 1641 1643
1649 1653 1656 1662 1667 1669
1674 1675 1683 1685-1688 1699
1702 1709-1710 1715 1717 1722
1724 1729 1731-1732 1735-1739
1741 1743-1744 1748-1749 1755
1760 1762 1767 1773 1778 1785-
1786



1372
1478 



82 289 730 780
1599 1614



986 1409 



endothelial 
cells 



Stratagene



EDT001 



3 5-10 13 15-21 24-26 29 34 37- 
39 42 44-45 50-51 53-55 57-58 
60-61 65-66 68-69 73-74 77-78 80 
82-83 85 87 89 93-96 101-105 108 
110 112-114 116 118-122 124 128 
133-134 137-142 147-150 152-153 
161-163 166-172 176-179 187 190 
192 194 196-201 204-207 210 212- 
214 220 224 223-230 233 235-236 
240-241 251-252 258 261-262 265
267-269 272 276-277 279-281 284-
285 288 290 295-296 301-302 310-
311 313 316 321 325 329 331-333 
335 340-342 351-355 360 371 375
380-382 384 387 390 392 397 400
407-408 410 412 414 416 425-427 
431 434-436 439 444-445 449 454 
463-464 472-475 477-479 486 488- 
490 497-498 500-504 510-513 516- 
515 522 524 526-528 532-534 536- 
540 542-546 548 561-563 566-567 
572-576 579 581 585-586 589 593 
595 597 599 603 607-612 615-617 
620 622 626 630 632-634 638-641 
644 647 656-660 662-664 670 673 
678 680-682 692-697 707 709-710 
712-713 719 730 732 734 736 738 
743-746 751 759 768 771 773 775- 
778 783 786-789 793 800 803 805-
807 810-811 814 816-818 821-822 
824 826 828-829 832 834-838 842- 
845 848-850 854-860 862 864 869
871 874 876-879 883 885 887 890-
891 894-895 898-900 903 908 910- 
913 916 919-922 924 926-928 930- 
935 939 943 940-949 951-954 957 
959-961 964 969-970 973 975-978
983-984 988-990 992-993 996-997 
1000 1002 1004-1013 1016-1020 
1022-1025 1028 1031 1033-1034 
1038-1046 1050 1055-1056 1059- 
1060 1062-1064 1067-1070 1072- 
1074 1076 1078 1082 1086-1087 
1089-1090 1093-1097 1099-1103
1107 1109-1113 1116-1117 1124- 
1126 1128-1131 1134-1135 1138 
1140 1144-1145 1148-1149 1153 
1157 1160 1163 1171 1183-1184 
1198-1199 1202 1205-1207 1211 
1216-1217 1219 1221 1225 1229 
1232-1235 1238-1241 1243-1244 
1246 1250 1253 1257-1258 1261 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



1265-1266 1268 1270-1271 1274- 
1277 1280-1283 1285-1286 1288- 
1290 1293 1295 1298 1308 1312
1317-1320 1324-1325 1327 1329- 
1330 1334-1335 1338 1342-1343 
1345-1347 1350 1355-1356 1359 
1367 1369 1374 1376 1379 1398 
1400 1406 1408 1414 1417 1419 
1424-1426 1428-1431 1434-1438 
1440-1442 1446 1450 1462-1466 
1468 1472 1474 1478 1487-1488 
1491-1493 1501-1504 1506 1509
1511 1516 1520-1521 1526 1529 
1531 1536-1537 1539-1540 1546- 
1547 1549 1552 1555 1557-1559 
1561-1565 1568 1571 1575 1578- 
1579 1581-1583 1587-1588 1590
1592 1597 1605-1606 1611 1613 
1615 1618-1621 1624-1628 1630- 
1631 1634 1636 1638 1641 1643- 
1650 1652-1659 1664 1666-1667 
1669 1671 1675-1681 1683-1688 
1696-1698 1703 1711 1715-1716
1719 1722-1723 1726 1731-1733 
1736 1739-1741 1743-1744 1749 
1755 1760-1761 1765 1767-1768
1771-1773 1776 1779 1783-1786 



Genomic clones 
from the short 
arm of 
chromosome 8 



Genomic DNA 
from 
Genetic 
Research 



EPM001 



286 686 1297 1303-1304 1352 
1411-1412 1754 



esophagus 



BioChain



ESO002



131-132 261 289 380 503 860 892
1000 1007 1397 



fetal brain 



Clontech 



FBR001 



62-63 89 112 126 194 322 336-338 
379 391 411 481 546 563 607 679
710 867 1012 1031 1055 1251 1262 
1320 1407 1643 1652 1686 1731- 
1732 1746 1765 



fetal brain 



Clontech 



FBR004



68-69 90-91 139 212-213 301 331 
362 374 403 436 611 645-646 659 
668 670 691 785 805 845 1163 
1209 1216 1232-1233 1238-1239 
1387 1410 1416 1430 1496 1536 
1547 1593 



fetal brain 



Clontech 



FBR006



5-9 25 43 60 62-63 65-66 70 72 
80 87 92 101 103 108 114 13$ 139 
149 152-153 157 168 171-172 175 
207-208 210 212-213 221-226 237- 
238 251-253 266 272 279-281 295 
301-302 307 310 317-318 321-324 
330 333-334 336-338 346-347 352 
357 370 373 377 379-380 382 384 
391-392 397 399 402 406-408 410- 
411 417 421 424 426-427 430 436- 
437 440-443 454 460 464 467 473 
476 483 488-489 495 497 508 510- 
513 516 519-520 524 530 537-540 
544 547 550 561 567 572-574 582 
590-591 595 597 604 607-609 615 
623 628-629 631 634 638-640 655 
657-658 660 665 669 674-675 679 
689 691-694 696-697 699 701 706 
710 716 720 728 732 734 736 742- 
744 757-760 763 775-778 780 799 
806-807 810 817-818 826 839 843 
858 861 864 871-872 884 890-891 
894-895 898 904 915 921-923 935- 
936 938 945 950 952 955-956 958- 
959 961 963 967 969-971 990 992 
SEQ ID NOS:



Tissue Origin 



RNA Source 



Hyseq 
Library Name 



999 1001 1005-1006 1008 1013
1016 1022 1024 1029-1030 1032 
1035 1042 1047-1048 1052 1056 
1065 1067 1070 1082 1089 1109 
1114-1115 1119 1131 1143-1149 
1151 1153-1156 1160 1163 1167 
1172-1173 1178 1184 1186 1188
1190-1200 1211 1216 1222-1223 
1226-1227 1229 1231 1236 1245
1253-1255 1258 1260 1262 1266
1270-1273 1281 1287 1308-1309 
1314 1317-1320 1326 1334-1335 
1339 1341 1344 1350 1356 1369- 
1371 1373 1376 1379 1381-1382 
1386 1392 1396-1398 1419 1423 
1425-1426 1428-1429 1432 1437 
1440-1441 1448 1466 1470 1482 
1502-1503 1507 1511 1513 1516
1519 1536 1544 1549-1550 1557- 
1559 1573 1589-1590 1598 1608 
1611-1614 1619 1621 1625-1626 
1640 1651 1657-1658 1676-1679 
1693 1696 1703-1704 1713-1714 
1718 1720 1722 1724 1726 1728 
1730-1733 1735-1736 1738-1739 
1742 1745 1755 1759-1761 1765 
1767 1771-1772 1777 1779-1780 
1786 



235-236 520-864 10*8 1188 1587 



fetal brain 



Clontech 



FBRS03 
FBT002 



fetal brain 



Invitrogen 



15-18 20-21 24-25 29 34 43 61-63 
77-78 98 101 103 107-108 128 130 
136 146 148 165-166 171 174 181 
185 196-198 204-205 208 223 230 
235-236 251 253 251 268-269 280- 
281 284-285 288 309-311 321 329 
334 339 346-347 350 357-359 381- 
383 390 407 418-419 430 434-435 
438 443-444 461 464-466 483 490 
494 509 516 519 522 527 557 561- 
562 572-573 590-591 595 597 623
632 647-648 650 655 669-670 672 
682 690-691 700-701 710 717 736 
746 782 784 788-789 814-815 825 
829 840-841 847 854-855 857-858 
897-900 904 919 925 935-937 946 
948-949 954 960-962 966 969-970 
986 996 1000-1001 1005-1007 1012
1014 1022-1028 1045 1052 1055 
1068 1070 1072 1078 1082 1085
1090 1109 1115 111B 1120'll28 
1136-1137 1144-1145 1149 1156- 
1157 1193-1195 1198 1204-1205 
1220 1222 1234 1257 1262 1271 
1274-1275 1280 1285-1286 1294 
1312 1314 1317-1320 1330 1342 
1344-1345 1349-1350 1355-1356 
1358 1364 1369 1379 1383-1384 
1431 1435 1476 1507 1519 1532 
1536 1547 1554 1564 1567 1578 
1582 1587 1593 1595 1601 1608 
1615 1619-1621 1638 1644 1661 
1665-1666 1673 1687-1688 1690 
1715 1723 1728 1749 1753 1757 
1759-1761 1765 1771 1774 1776
1778 1781-1782 1786 



fetal heart 



Invitrogen 



105 124 160 289 864 1036 1148 
1229 1614 1616 1762 1785 



fetal kidney 



Clontech 



FKD001



5-8 11 40 47 57 65-66 82 85 102 
124 163 171 216 222 224 235-236
Tissue Origin | RNA Source



Hyseq 
Library Name 



SEQ ID NOS: 



fetal kidney 



Clontech



FKD002



258 277 280-281 307 310 314 330 
371 387 392 395 403 422-423 431
436 443 455 469 500 519 522 542
563 572-573 585 600 619 623 650
654 657-658 660 679 719 731 780 
798 821 833 844 854-855 857 864 
868 878 911 929 958 960 969 990
992 1007 1046 1087 1103 1129 
1139 1285 1312 1331 1355 1369 
1371 1376 1391 1422 1425-1426 
1440-1441 1470 1543 1598 1601 
1618 1631 1651 1654-1655 1669 
1678-1679 1691-1692 1733 1785 



fetal kidney 



352 384 426-427 440 583 602 1060 
1131 1324-1325 1636 



fetal lung 



Invitrogen 
Clontech 



FKD007 



20-21 82 163 335 679 988-989 
1000 1227 1230 1320 1554 



FLG001 



fetal lung 



Invitrogen 



35-36 94 323 371 398 426-427 445 
473 549 560 604 616-617 626 631 
649 651 719 746 786-787 832 842 
849-850 864 894-895 1075 1178 
1182 1200 1206 1309 1311 1345 
1429 1493 1567 157$ 16*20 1686 



FLG003



fetal lung 



9 15-16 29 41 47 68-69 83 88-89 
102 124 137 152-153 165 196 224 
229 231 249 254 256 267 291-292 
300 325 333 344-345 352 373 376 
379 384 408 425-427 430 432 467- 
468 475 483 488 493 516 531 535 
545 547 549 564 582 602 623 644
660 662-664 670 673 725-726 728
761 766-767 774 805 830 852-853
864 875 921 932 937 946 949 963 
988-989 1014 1016-1017 1024 1027 
1090 1097 1170 1165 1200 1215- 
1216 1224 1258 1290 1309 1320 
1342 1347 1355 1369 1331 1413- 
1414 1431 1438 1449 1491 1512 
1536 1547 1557-1560 1557 1590 
1601 1636 1644 1653-1655 1662 
1667 1671 1675 1680-1681 1706 
1739 1760-1761 1769 



Clontech 



FLG004 



fetal liver - 
spleen 



Columbia 
University 



FLS001 



103 276 334 
1514 1658 



465-466 737 843 1131



3-11 13 15 21 25 30 39 41-48 50-
51 54 56-58 60-66 68-69 72 75
77-80 82-83 85 87 89 92-103 105-
110 112 116 124 126-127 130 133
135-139 141 144 147-149 152-153
157 163-165 167-172 174 176 178
180 186 188 190 193-194 196 198-
200 202-206 210-214 219 221-231
233-236 240-244 246-247 250-251
255-256 258 261-265 268-269 272
274 276-278 280-281 284-286 288
293 295 299-301 304 306-307 309
311 314 316 318 320-321 326 329-
332 342 344-345 350 352-353 356-
358 360 362 370-374 376 378-384
386-387 390 392-393 400-401 403
406 408 410-412 415 417 419 422-
437 439-442 444-445 448 452-454
456 459 461-470 472-479 481-483
487-488 490-491 493 500-501 503-
506 509-513 515-520 522-524 526-
529 531 534 536-540 542 547-549
553-554 561-562 564 567-568 571-
576 579 581 583 585-597 599-605
Tissue Origin



RNA Source



Hyseq
Library Name



SEQ ID NOS:




















607 610-613 615-621 623-624 626
628-634 636-640 644 647-650 655-
660 665 669-670 672 674-675 678
681-682 684 690-695 697 702 708-
710 713-714 716-719 725-728 730-
731 734 736 738 740-741 743-746
748 750-751 759-766 768 772 774-
777 779 783-788 793 796 798 800-
805 808 810-812 814 810-819 821-
824 826-832 834-837 843-847 849-
867 869-876 878-883 887 889-895
897-898 902 904-914 916 919 921-
928 930-937 939 945-950 953-958
960-961 963-965 967 969 971 974-
978 980-983 986 988-990 992-993
995-997 1000-1002 1004-1006 1012
1014 1016-1019 1025-1026 1028-
1031 1033 1035-1036 1039-1044
1047 1049-1050 1053-1056 1058-
1059 1061-1064 1067-1070 1072-
1074 1076 1078 1082 1085-1087
1089-1090 1097 1099-1103 1107-
1113 1115-1119 1121-1123 1125
1127-1128 1131-1134 1136-1137
1144-1150 1153 1159-1160 1163
1170 1175 1177-1178 1188 1190-
1192 1195-1200 1202 1206 1208-
1211 1214 1216 1218 1221-1222
1225 1227 1234 1237 1241 1244
1246-1247 1251 1254 1258 1261
1266 1268 1270-1273 1277-1282
1284-1285 1287-1290 1294 1299-
1300 1306-1308 1313-1320 1324-
1325 1327 1330 1332-1333 1338
1341 1343 1345-1347 1349-1350
1353-1360 1362-1363 1365-1367
1369-1370 1372-1374 1376 1378-
1381 1383-1384 1386 1389-1391
1400 1402-1403 1405-1410 1413
1415 1417-1419 1422-1429 1431
1435-1437 1439-1442 1445-1446
1448-1449 1454 1458-1459 1466-
1470 1472 1474 1477-1478 1480
1482 1485 1491-1493 1496-1498
1501-1507 1509 1511-1512 1516-
1519 1524-1526 1529 1532 1536-
1541 1546-1547 1549-1550 1552-
1554 1562 1564 1569 1572 1574-
1575 1578 1581 1583 1587-1588
1591-1592 1594-1595 1597-1598
1600-1604 1611-1612 1614-1615
1617-1618 1620-1622 1624-1625
1627-1628 1630-1632 1634-1639
1645-1651 1653-1662 1664 1667-
1669 1671 1673-1674 1676-1688
1690 1696 1701-1703 1706-1709
1711 1713-1714 1718-1719 1722
1724-1727 1731-1733 1738 1740-
1741 1743-1744 1746 1748 1751-
1752 1754 1760-1765 1767-1773
1780 1783-1786








fetal liver-
spleen

Columbia
University

FLS002

3-11 13 15-21 25 29 32 35-39 42
44-45 48 50-51 54-55 57-58 61 64
68-69 73-75 78 80 82 84 87 95-98
100 103 105 107-108 110 112-113
116-119 122-125 128 130 137-138
145 147-153 155 157 159 161-163
166 168 171-172 174-175 177 181
188-189 193-194 196-198 200-203
Tissue Origin



RNA Source



Hyseq
Library Name



SEQ ID NOS:
























206 212-215 219-221 223 225-229
231-232 240-244 246-247 250-251
258-259 262 264 268-269 272 275
277 280-281 284 286 288 290-292
295 298-299 301-304 306 308-310
318 320-321 323 325 329 331 334
342 348-349 352-353 356 359 368
371 374 376-379 381-384 386-387
392-393 397-398 400-401 403 410-
413 421 423 426-427 429-430 433-
436 438 440 443 445 448 451-452
454-455 460-463 465-467 469 471-
473 475-476 478-479 481-483 487
490-491 493-494 497 500-501 503-
505 509-513 515-517 519-520 524
526-531 534 537-542 544 547 552-
554 556 558 561-562 564-567 571-
577 583-587 590-591 593 595 597
601 604-606 608-613 616-617 619-
624 626-632 634 637-642 644 647
649-652 654-659 662-665 669-672
674-675 681-682 685 688 690 696
698 700-703 707 709-710 713 717
719-721 723-724 728 731-732 734
737-738 742-745 74 a 752 754 759
763-766 768 770 773-777 780 782
784 786 791 795-798 801-802 805
808 811-812 818 823-824 826-827
832 834-837 839 843 846 848-856
858-861 865 867 869 871 873-874
876 878 881-882 887 889 892 894-
898 901-902 904 906-908 913-915
919 921-924 926-932 934-935 937
939-941 943 946-947 950 953 958
961 965-967 971 973-975 977-979
981 984-985 990 992-993 995-997
999 1001 1004-1007 1009-1011
1013 1016 1020 1023 1025 1027-
1031 1033-1035 1039-1042 1044-
1045 1049 1053 1055-1056 1058-
1059 1062 1064-1065 1067-1070
1072-1074 1079 1082 1087 1089
1093 1097 1099-1103 1105-1107
1109-1114 1123 1125-1127 1132-
1134 1140 1143-1145 1146-1150
1156 1158 1160 1163 1172-1173
1177-1178 1181-1184 1190-1192
1195-1197 1199 1204 1206 1208
1211 1214 1216 1219 1227 1230
1234-1235 1237 1240-1241 1243
1245 1247 1256 1258 1260-1261
1264 1266 1270-1271 1275 1278-
1279 1284-1286 1288-1289 1299-
1301 1306 1308 1312 1314 1317-
1319 1323-1325 1327-1330 1334-
1335 1339 1343-1347 1349-1350
1354-1355 1357 1360 1362-1363
1365-1367 1369 1372 1376 1378-
1380 1386 1389-1391 1394 1400
1403 1406 1409 1416-1419 1422-
1427 1429 1435 1437-1438 1440-
1442 1446 1448-1450 1453 1460-
1461 1468 1470 1472 1474-1475
1478 1482 1486 1490-1493 1496
1498 1500-1504 1506 1508-1509
1511-1512 1516 1518-1519 1521
1524-1528 1531 1536-1538 1543
1547 1550 1554 1556 1564 1567-
1569 1580 1587-1588 1591-1592
SEQ ID NOS:



Tissue Origin 



RNA Source 



Hyseq 
Library Name 



1597 1600-1601 1611-1612
1618-1628 1630 1631 1635-1638
1641 1646-1649 1652 1654-1659
1661-1662 1664 1667-1669 1674
1676-167S 1683-1684 1686-1688
1691-1692 1699 1702 1707 1711
1713-1714 1717 1719 1722 1726-
1727 1730-1733 1738 1740 1743-
1744 1748-1752 1758 1760-1761
1763-1764 1767 1769 1772-1773
1776 1779 1783-1786



fetal liver- 
spleen 



Columbia 
University 



FLS003 



103 300 318 321 352 372 379 381 
384 392-393 403 422 424 429 434-
435 440 444 453 503 515 544 592 
978 1064 1324-1325 1327 1333 
1357 1369 1378 1418 1424 1622 
1646 1649 1680-1681 1689-1690 
1717 1743-1744 1769 



15-16 26 34 58 61 64 70 75 78 89
98 105 112 116 120-121 123 133 
151 166 176 180 194-196 198 200 
204-206 210-211 220 225-226 230 
235-236 239 247 259 261 267 272 
277 280-281 303 310 313 317 320- 
321 329 344 356 371 374 376 379- 
382 395 408 412 414 419 429 434-
435 441-442 465-466 490 494 504- 
506 509 522 527 534 552-553 562 
567 569-570 572-574 607 631 657- 
658 667 669 672 685-686 702 717 
725-726 732 748 759 761 778 784 
786 809 817 829 837 857 861 872-
873 875 881 889 894-895 909 911
916 954 963 967 974 977 986 988- 
989 993 995 997 1000 1005-1006 
1008 1014-1015 1020 1042-1043 
1070 1086-1087 1089-1090 1118- 
1119 1122 1144-1145 1148 1153 
1157 1159 1183 1195-1196 1227 
1250 1257-1258 1262 1267 1280
1285 1307 1312 1314 1317-1320 
1344-1345 1349-1350 1355 1362-
1363 1403 1405 1415 1419 1425- 
1426 1429 1431 1442 1448 1463- 
1464 1469-1470 1489 1528 1536 
1539 1549-1550 1557-1562 1577 
1583 1598 1601 1611 1615 1622 
1644 1649 1666 1674 1706 1721
1738 1746 1763-1765 1774 1776 
1779 



fetal liver 



Invitrogen



FLV001 



fetal liver



Clontech 



FLV002 



676 998 1719 



fetal liver 



Clontech 



FLV004



93 133 214 301 355 374 379 555 
581 601 679 837 847 859 1123
1236 1270 1313 1324-1325 1327 
1355 1367 1425-1426 1536 1690 
1733 1760-1761 



fetal muscle 



Invitrogen 



FMS001 



26 37-39 50-51 58 84 86 89 98 

113 128 131-132 139 155 172 186 

194 198 201 206 211 230-231 256 

261 276 282 286 302 325 359 361 
376 379 383 398 412-413 419 430 

435 448 452 462-463 473 477 503 

519 529 561 569-570 590-591 597 

607 623 626 635 647 660 672 715 

725-726 730 733 761 775-777 788 

826 837 860 874 913 915 921 935 

970 980 986 988-990 992 1000-

1001 1007 1014 1027 1035-1036 

1045 1060 1064 1070 1083 1097 
Tissue Origin



RNA Source



Hyseq
Library Name



SEQ ID NOS:




















1099-1102 1116-1117 1121 1164
1173 1196 1208 1228 1240 1258
1266 1270 1277 1298 1317-1320
1324-1325 1329 1336-1337 1369
1383-1384 1399-1400 1403 1409
1433 1505 1514 1542 1551 1554
•t act -1559 1562 1589 1599 1620
1632 1644 1650 1652 1671 1675
1712 1725-1726 1743-1744 1754
1766












fetal muscle

Invitrogen

FMS002

119 221 273 402 426-427 463 547
599 736 869 1000 1033 1083 1266
1431 1440-1441 1468 1545 1599
1673 1678-1679 1687-1688 1710
1712-1714 1723 1725 1731-1733
1743-1744 1760-1761 1767




fetal skin

Invitrogen

FSK001

1 4-11 15-16 20-23 25 29 33 40
43 46 56-57 60-61 64-66 75 82 87
97-98 105 107-108 113 lie-119
123 133 135-137 139 144 146 148
151-153 156 163 170 176 180 188-
189 197-198 200 202-203 210 218
222 231 246-247 261 263 265-270
277 285-286 290 293 299 301 307
311 321 325 328 330 333-335 339
341 345 351-352 355-356 358-359
362 368 370 372 376 379-382 384
388 394 404-405 408-409 411-412
419-420 424 426-427 436 441-442
445 448-449 454 462 465-466 472
476 490 493 504 506 509 515-517
519 526 531 537-540 547 549 560-
561 567 572-573 581 584 589 611-
612 615 623 630-631 635 647 649
651 657-658 660 662-665 667 669
672 676 678 681 688 701 704-705
709-710 713 717 720-721 725-726
728-729 732 748 750 753 759 764
766 770 775-777 780-781 786 788-
789 798 809 811 814 816-817 822
824-826 831 842 857 859 861 863-
864 881 894-895 908 910-911 916
918 922-923 928 932-933 935 937
946 948-949 953 960-961 966-967
970 975 977 986 990 992-993 999-
1000 1004 1007 1013 1018 1025
1027 1032 1035 1041-1043 1054
1057-1058 1060 1062-1064 1069
1072 1077 1090-1091 1097 1099-
1103 1108 1113 1119 1123 1128
1131 1134 1140 1148-1149 1152-
1153 1156 1163 1167 1178 1182
1189 1192 1195-1196 1198 1201-
1205 1208 1211-1212 1216 1219-
1220 1222 1225 1240 1243 1258
1266-1267 1274 1277 1280 1282-
1285 1299 1310 1317-1322 1324-
1325 1329-1330 1342 1344 1346
1349-1351 1354-1357 1365-1366
1369 1371 1373 1375 1378 1380
1383-1384 1387 1399-1400 1405
1410 1427 1429 1431 1433-1435
1439-1441 1448-1449 1454 1457
1468 1470 1472 1475 1480-1481
1487 1490-1491 1493 1498 1509
1512 1521 1525-1526 1529 1535-
1536 1547 1549 1557-1559 1588
1592 1595 1597-1598 1601 1603-
1604 1608 1611 1614 1618 1624-
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS:



1626 1632 -T63T 1636 1641 1643-
1644 1646 1654-1657 1660-1662
1665 1668 1675 1685 1687-1689
1702-1703 1709-1710 1716 1719
1724 1727 1731-1732 1737-1740
1742 1747 1749 1755 1760-1761
1765 1772 1776-1777 1779-1780
1786



fetal skin 



Invitrogen 



FSK002 



fetal spleen 
umbilical cord



13 286 302 307 313 321 330 335 
339 341 354 370 372 385 400 402 
408 414 426-427 433 436 450 454
515 544 585 598 767 810 845 939
1076 1109 1155 1317-1320 1326 
1333-1335 1343 1347 1350 1369-
1371 1377-1378 1391 1397 1422
1466 1647 1656 1678-1679 1687- 
1688 1693 1718 1721 1725 1731- 
1732 1739 1755 



FSP001 



110 137 211 353 589 927 1108 
1639 1771 



BioChain



BioChain 



FUC001 



4-8 10 12 14 17 33-36 44-46 57 
64 68-69 75 82 85 101 104 113- 
114 116 119 122-124 133 137 153-
154 157 161 163 166-167 175 181- 
184 186 192 197-198 200-202 212- 
215 230 234 246-247 251 256 263 
267 271-272 280-281 284 295 301
314 317 321 326 333-335 345 351 
356 368 371-373 379-380 386 390
392 394 406 408-410 412 414 416 
420 424 427 430-436 438 444-446 
454 459 461 463 467 473 482-483 
486 488 490 495 504 509 524 526 
537-540 547 555 561 574-577 588- 
591 593 606 615 620-621 632 637 
645-647 650 659-660 662-664 667- 
668 674-675 684 687 696 698 701
703-705 709 711 714 719-720 725- 
727 732 749-750 762 765 771 775- 
777 780 789-791 793 796 802-803 
814-817 822 833 843 845 848 858 
861 864 875 879 888 894-895 897- 
900 903 906-907 911-912 925 930- 
933 936 940 948 953 960 966 977 
984 990 992 998 1000-1001 1005- 
1007 1016 1023 1025 1037 1046- 
1047 1059 1061-1063 1073 1076- 
1077 1089 1094-1097 1112-1113 
1115 1134 1144-1148 1151 1154 
1156 1163 1171 1197 1204-1205 
1208 1216 1218 1224 1234-1235 
1243-1244 1246 1279 1283 1286- 
1287 1298 1316 1320 1344 1346 
1350 1357 1359 1371 1373 1375 
1381 1398 1400 1403 1408 1414 
1424 1427-1428 1431 1433 1440- 
1442 1446 1454-1455 1479 1482 
1484-1485 1489 1492-1493 1504- 
1505 1513 1525 1527 1536 1538 
1546 1565 1567 1571 1573 1575- 
1576 1578-1579 1591 1595 1600- 
1601 1608 1612 1615 1621 1624 
1626 1636-1637 1647-1648 1651 
1653 1656 1658 1661-1662 1672 
1675 1682 1684 1686-1688 1690 
1709-1710 1722 1727 1729 1735- 
1738 1740-1741 1760-1761 1768 



fetal brain 



GIBCO 



4 9 11-13 17-18 22-23 25 37-39 
42-47 50-51 54-55 58 60-61 65-66 
Tissue Origin



SEQ ID NOS: 



RNA Source 



Hyseq 
Library Name



72 75 77 80 82 85 90-91 94 100- 
102 107 110 112-116 118-119 122- 
123 126 128 134 136-140 147-148 
153-155 157 161 165 169-172 175 
181 186 188-189 197-198 204-206 
208 210 215 222-223 225-226 230 
235-238 240-241 247 253 256-258 
260-262 267-269 276 279-281 284
286 289 298 300-302 307 310 318 
321-323 325 330-331 339 341 346- 

349 352 354 356-359 362 364-365 
371-372 377 379-380 382 384 387 

390 400 408 414-416 419 424 431
434-435 438 441-443 449 451 453- 
455 457-463 470 472-473 475 477- 
478 482-483 486-488 490-491 493 
496 499-500 502-504 506-507 509- 
512 515 519-520 522 525-526 529-
530 537-540 543-544 546-547 566- 
567 569-570 572-582 585 588 590- 
591 593 595 599 601 604 606-609
611-612 614-620 622-624 630 632 
636 643 645-647 650-652 654 659 
661 665 667-668 670-672 676 678 
681 687 689 692-694 697 699 710
714 717 721 727 729-732 734 736 
738 743-746 750-751 759 763 766 
770 772 775-777 784 789 791 796 
799 802-805 810-811 814 819-821
824 826 830 834-837 839-850 854- 
856 858-860 862 864 869 871 876- 
877 879 883 886-887 890-891 893-
895 898-901 905 908-910 912-916 
919 922-923 925 927 930-933 935- 
938 948 952-960 963-964 967 969- 
972 975 978-979 981 983 986-987
990 992 995 997 999-1002 1005- 
1009 1011-1013 1016 1018-1019 
1023 1026 1029-1031 1033-1035 
1038 1041 1047 1050 1053 1057 
1059 1064 1068 1070 1072-1073 
1078-1079 1081-1082 1086 1089 
1094 1097 1103 1107-1109 1113- 
1115 1121-1122 1127 1134-1135 
1138 1140 1143 1148-1151 1153 
1156-1157 1159 1167 1170 1175 
1193-1194 1200 1202 1207-1209 
1211 1216 1219-1220 1226-1227 
1229 1232-1234 1240-1241 1243 
1246 1249-1251 1253-1254 1258
1267-1268 1271 1276 1279 1282 
1285-1289 1293-1294 1305 1307- 
1308 1312 1316 1320 1327 1338- 
1339 1341-1344 1346 1349 1355- 
1357 1359 1365-1366 1369-1370 
1373-1375 1379 1386 1389 1394 
1398 1409 1413-1414 1416-1417 
1420-1421 1425-1427 1430 1433 
1437 1439 1442 1445-1452 1454- 
1457 1459 1463-1464 1468 1470 
1474 1477-1479 1489 1492 1494 
1497-1498 1501-1503 1507 1509 
1511-1513 1517 1520-1521 1524- 
1526 1531-1533 1535 1537-1538 
1547 1554 1556-1559 1564-1567 
1571 1584 1587 1589 1594 1599- 
1601 1611-1612 1614-1616 1619- 
1620 1625-1628 1630-1631 1634 
1637-1638 1640-1643 1645 1648- 
Tissue Origin



macrophage 
infant brain 



RNA Source 



Hyseq
Library Name 



SEQ ID NOS:



1649 1651 1653-1655 1657-1658
1664-1665 1667 1669 1673 1678-
1679 1683-1684 1686 1693 1701
1704-1705 1709 1713 1714 1717-
1720 1724 1727 •1726 1731-1733
1737-1738 1743-1744 1752 1754-
1755 1757 1760-1761 1765 1772
1779 1785



Invitrogen 



HMP001 



Columbia 
University 



5-8 110 204-205 503 634 678 859 
878 933 988-989 1379 1448 1504



IB20G2 | 10 12-13 15-18 22-23 
37-39 43 47 50-51 54 
65-66 68-69 72-74 80 
88-92 97 100 102-104 
112-113 115-116 118 
134-136 138-139 143 
152 154-155 163 165- 
175 181-184 186 193- 
203-205 209-210 214- 
226 231-232 235-236 
252 257 260 268-269 
279-281 286 288 291- 
300-301 304 307 310 
330-331 333-334 339 
352 356-357 362 371 
380 383-384 392 397
411 413-414 416 428- 
430-431 434-435 438 
454 461 464-466 469 
475-476 478 482-483 
494 497 503 507-508 
519-520 524-526 530 
547 550-551 561 563- 
572-576 579 581-582 
591 593 595-597 607 
616-617 620 622-624
641 645-647 650-655 
665 667-675 689 691 
703 707 713-715 717 
733-736 739 743 745 
763 769-770 772 778 
788-789 793-794 799 
814 825-826 830 834- 
845 848-850 854-855 
865 870 872 875-876 
890-891 894-896 898 
917 919 922-925 927- 
934-936 938 941 9-15- 
953-954 959-962 966- 
981 986-990 992 997 
1004-1006 1014 1016 
1024-1025 1033 1036 
1052 1054-1055 1057- 
1064 1068-1070 1073 
1085 1089 1108-1113 
1123-1124 1130 1132- 
1149 1151 1153-1154 
1172 1174-1175 1183 
1190 1193-1194 1196- 
1204 1208-1209 1211 
1226-1227 1229 1231 
1247 1249 1251 1256 
1262 1269 1274 1279 
1285 1287-1289 1294 
1307 1333-1314 1316- 
1332 1341-1342 1345 
1362-1363 1365-1366
1374 1381 1383-1384 
1403 1406-1407 1413 



25 29 34 
56 58 60-63
82-83 86
106-108 110 
123 128 130
147-149 151- 
167 169 172- 
196 198 202 
215 222 224- 
239 246-247 
272 276-277 
292 295 298 
313 321-323 
346-347 349 
372 377 379- 
401 406 408 
419 422 428 
443 449 453- 
470 472-473 
487 490 492 
510-513 516 
534 536-540 
564 566-567 
584-587 590- 
•609 611-613 
627 632 637 
657-658 660- 
695 697 699 
721 728-731 
751 755 759 
780-781 785 
803 808 811
836 840-843 
860 862 864- 
878 886 888 
903-904 916- 
928 930-932 
946 948-950 
969 977 979 
999-1000 
1018-1019 
1047 1051- 
1059 1063- 
1081-1082 
1118-1120 
1138 1140 
1163-1170 
1184 1188 
1197 1199 
1218-1222 
1234 1241 
1258 1261- 
1281 1283 
1295 1305 
1320 1329 
1349 1356 
1368-1370 
1388 1400 
1417 1420 






WO01/53312 



PCT/US00/34263 



Tissue Origin | RNA Source



Hyseq 
Library Name



SEQ ID NOS: 



1423 
1441 
1454
1468 
1483 
1499 
1522
1542 
1555 
1580 
1593 
1610 
1624 
1639- 
1654- 
1672- 
1693- 
1717- 
1733 
1755- 
1777- 



1429 
1443 
-1455 
1470 
1495 
1502- 
-1523 
1546- 
1563 
1583-
1595
1612 
1626- 
•1640 
1655 
1673 
1695 
1720 
1735- 
1758 
1778 



•1431 
1447 
1457 
-1471 
1493 
-1503 
1525 
-1547 
1565- 
-1586 
1598 
1614- 
-1627 
1642 
1658- 
1676- 
1701- 
1723 
1741 
1762 
1786 



1435 
1449 
1459
1475 
-1494 
1505- 
1526 
1549 
-1567 
1588 
1600- 
-1616 
1630- 
1644 
•1659 
•1681 
1702 
1724 
1743- 
1765 



-1436 
1451 
1463 
1479 
1496 

-1507 
1531

-1550 
1569 
1590 

-1601 
1619 

■1633 
1647 
1654- 
1685- 
1704 
1726- 
1744 

1771 



1439- 

-1452 

-1465 
1482- 
1490- 
1509 
1533 
1554- 
1575 
1592- 
1608- 
1621 
1637 
1652 

1665 

1688 

1708 

1728 

1752 

1774 



Columbia 
University 



IB2003 



infant brain - 



infant brain 



Columbia 
University 



IBM002 



Columbia 
University



IBS001 



17-18 20-23 29 34 43 50 68-69
78-80 88 100-101 107 110 112 118 
123 128 133 135-137 146 148 152 
159 166 169 174 194 198 203 215 
223 225-226 229 235-236 247 260 
276-281 286 290-292 295 300-301
310 322 324 331 334 339 346-347 
349-350 352 357 371 376-377 382 
384 403 408-409 414-415 453-455 
472 476 478-479 490 503 507 516
520 530 534 536-540 551 563 572-
576 585 587 590-591 593 595-596
601 606 612 616-617 620 622-624
650 652-653 661 665 670-671 674-
675 678 689 715 717 727-728 730
734 759 775-777 780-781 785 796
806-807 811 824 845-846 864 869 
875 882 889 894-895 898 904 917 
919 921-923 932 935-936 946 950 
954 962 977 979 997 999-1000 
1005-1006 1009 1011 1017 1024 
1033 1037 1043 1055 1057 1109 
1114-1115 1120 1123 1127 1144- 
1145 1149 1151-1153 1160 1167 
1170 1174 1193-1194 1196 1199 
1202 1206 1209 1220-1221 1226 
1229 1240-1241 1251 1258 1284 
1288-1289 1305 1314 1327 1333
1344 1347 1350 1356-1357 1365- 
1366 1378-1379 1388 1400 1403 
1421 1423 1431 1436 1440-1441 
1446-1447 1457 1459 1471 1499 
1503 1507 1509 1535 1546 1557-
1559 1567 1572 1587 1595 1598
1610-1612 1615 1631 1639 1644
1647 1657-1658 1673 1676-1681
1683-1684 1701-1702 1708-1709
1713-1714 1719 1757 1760-1761 
1765 1771 1778 

101 113 139 152 260 279 290-292
374 377 551 563 608-609 653 659 
314 954 1005-1006 1029-1030 1130 
1164 1209 1258 1294 1305 1320 
1327 1397 1431 1498 1507 1615
1640 1694-1695 1763-1764 1767 
1779 

10 12 119 175 279-281 321 334



371 446 551 563 623 652 667 669 
SEQ ID NOS:



Tissue Origin 



RNA Source 



Hyseq
Library Name 



671-672 819 949 966 
1151 1188 1193-1194
1253 1265 1271 1207 
1324-1325 1342 1423 
1448 1471 1482 1525 
1562 1569 1588 1591 
1647 1649 1658 



1113 1130
1196 1229 
1317-1319 
1440-1441 
1532 1546 
1610 1618 



5-9 17 20-21 25 
153 157 197-198 
213 223 262 266 
333 356 370 427 
472 493 498 503 
537-540 542-544 
599-600 607 615 
692-694 712 719 
794-796 810 837 
856 869 876 903 
964 975-976 984 
1024-1025 1033 
1070 1072 1082 
1136-1138 1140 
1233 1246 1279 
1320 1334-1335 
1446 1478 1482 
1552 1555 1567 
1620 1625 1632 
1655 1662 1680- 
1690 1696 1702 
1760-1761 1778 



lung, 
fibroblast 



Stratagene



LFB001 



68-69 82 94 105 
203 207-208 212- 
233 302 321 326 
430 436 446 462 
516 519 527 535 
562 565 567 586 
630 647 662-664 
745 748 775-777 
843-847 849 854- 
934 953 955-956 
1000 1005-1007 
1039 1053 1064 
1112-1113 1134 
1195 1223 1232- 
1285 1295 1311 
1343 1427-1428 
1493 1504 1537 
1575 1582 1598
1638 1645 1654- 
1681 1684 1686 
1711 1733 1741 
1785 



lung tumor 



Invitrogen 



LGT002 



5-10 18 20-21 29 33-36 40 43 52 
54-55 61 65-66 68-70 73-75 80 85
88-89 93-94 100 103 106-108 112- 
113 115-116 118-113 123-124 126 
130-132 135-137 139-141 143-144 
147-148 151-153 155-156 159 161 
164 169 171 179-180 185 190 192 
194 196-199 203-208 210 212-214 
216-217 219 222 233 240-241 244 
246 251-252 255-256 261-262 266
272 276-277 279-281 284 286 288 
290 295 298 301-302 309-312 317 
321 329 332 341-342 344-345 348 
352 358-360 363 368 370-371 376 
380-381 384 389-390 398 400 409 
414 423 426-427 430 432-436 443- 
444 450-451 454 462 468 472-477 
480-483 487-488 490-491 493 496- 
498 500 503-506 509-512 515-516
519 521-523 526 530 534 541 544
547 554 557 564 566-567 572-576
585-586 588-589 595-596 601 607 
611-612 615 619 621 623 626 630 
632-633 644 647 649 651 655-656 
660 662-665 667 659 672 683-684 
696 700 706 710 713 716 718-719 
722-723 728 734-739 743 750 752 
763 765-766 773-778 784-785 787-
789 791 800 802-803 809-812 814 
824 826 828-829 832 838-839 841- 
845 849-850 852-855 857-861 864 
866 874 878-880 882 867 890-891 
897-898 902 904 906-907 910 916
918-920 922 924-925 927 930-932 
934-935 937 947 950 953 955-956
961 963 966-967 969 971 977-979 
981 984 986-987 990 992-993 995 
997 999-1001 1005-1007 1009 
1012-1013 1018 1020 1022-1024 
1026 1029-1030 1033 1038 1041 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



lymphocytes 



ATCC



LPC001 



"104T 
1059 
1074 
1097 
1116 
1139 
1152 
1172 
1202 
1222 
1257 
1278 
1289 
1317 
1344 
1357 
1383 
1403 
1431 
1448 
1470 
1488 
1508- 
1519 
1540 
1561 
1591 
1602 
1624- 
1644- 
1656- 
1671 
1685- 
1705 
1730 
1748- 
1767 
1778- 



1047 
1063 
1078 
1104 
1117 
1141 
-1153 
1178 
1204 
1227 
-1258 
1280- 
1295 
-1321 
•1346 
1365 
•1385 
1408 
1433- 
1454- 
1474 
1490- 
1509 
1523- 
1546 
1555 
1593- 
1608 
1625 
1645 
1662 
1673- 
1688 
1709 
1735 
1749 
1770- 
1779 



=oo5o 

-1064 
1085 
1106 
1119 
-1142 
1156
1195 
1208 
1234 
1265 
-1281 
1300 
1329 
1349 
-1366 
1394 
1417 
-1436 
•1455 
1480 
1491 
1511 
1524 
1549- 
1567 
1594 
1614- 
1627- 
1647- 
1664 
1675 
1690- 
1716- 
1739 
1753 
1771 
1786 



1052 
1067 
1087 
1107 
1126 
1144 
1158 
-1196 
1214 
1241 
1267 
1283 
1305 
1338 
1351 
1369 
1397 
1419 
1438 
1460 
1461 
1494- 
1512 
1528
1550
1569 
1596- 
1616 
1632 
1649 
1666- 
1678- 
1692 
1717 
1741 
1760- 
1773 



1054 
-1071 
1089 
1109 
1134 
-1145 
1167 
1198 
1216 
1247 
•1270 
1285 
1308 
■1339 
1353 
1378 
1400 
1423 
1444 
1466 
1483 
1496 
1515- 
1529 
1555 
1575 
1598 
1618 
1636 
1652- 
1667 
1679 
1696- 
1722 
1743- 
1762 
1775 



-1055 
1073- 
1095- 
1112 
1135 
1148 
1170 
-1200 
1219 
1252 
1276 
1288-
1312 
1341 
1355 
1379 
1402- 
1426 
1446- 
1468 
1486- 
1506 
1516 
1536- 
1560- 
1588 
1600- 
1620 
1639 
1653 
1670- 
1683 
1699 
1727 
1744 
1765 
1776 



4 11-12 18 24-25 30-31 48 50-51 
56-57 68-69 80 92 98 103 105 110 
126 137 152-153 157 165 172 188- 
189 197 203 210 217-218 222-223 
225-226 229 231 247 251 256 264 
272 280-281 284 300-301 321 325- 
326 339 348 352 357 371 382 384 
390 400 404 412 414 421 423 426- 
427 430-431 445 447-448 451 454- 
455 475 503 516 526-527 530 537- 
540 549 556-560 553 574 577 589
602 613 615-617 621 623 628-630 
636-637 647 649 657-659 690 697 
717 723 755 764 775-777 780 786 
789-790 793 800 802 822 838 849 
866 869 876 881-883 892 898 906- 
907 911 921-923 928 975 990 992 
996 1001 1004-1007 1033 1050 
1054 1078 1107 1135 1140-1141 
1143 1148 1158 1163 1177 1199 
1205 1216 1226 1231 1236 1241 
1244 1250 1258 1260 1265 1269- 
1271 1290-1293 1308 1312 1317 
1319-1320 1339 1345-1346 1348 
1350-1351 1357 1367 1369 1379 
1381 1383-1384 1386-1387 1389 
1394 1397 1405 1423 1425-1428 
1431 1437 1446 1448 1461 1466 
1470 1472 1474 1482 1492 1506 
1526 1537 1546 1549 1591 1598 
1600 1603-1604 1606 1627 1636 






Tissue Origin 


RNA Source


Hyseq 






SEQ ID NOS: 










Library Name 






















1638


1647-1649 1651 


1658 


-1659 








1664 


1676-1677 1680- 


1681 


1687- 








1688 


1699 1711 1715- 


1716 


1726 








1728 


1737 1740 1746 


1748 


1752 








1756 


1758 1777 1779 








leukocyte 


GIBCO 


LUC001


3-4 


10-11 13 15- 


.18 20-21 


24- 


25






30-31 35 


-36 40 43-45 


48 


50-51 








54-58 60 


-63 68-69 75 


79- 


80 82-83 








85 88-91 


93-96 98 100 103-104 








107- 


108 


112 116 


119 


123 


125- 


128 








134- 


140 


142 147- 


-149 


151 


153 


155 








157 


162- 


163 167 


169- 


172 


174 


177- 








179 


186 


190 192-199 


203- 


207 


210 








212- 


215 


217-219 


222- 


223 


229 


235- 








236 


247 


251 255-258 


260 


262 


272 








274- 


277 


280-281 


285- 


285 


297-301 








307- 


310 


313-314 


316- 


317 


321 


325- 








330 


333- 


334 340 


-342 


348- 


349 


352 








354- 


358 


370-371 


380-385


387-388 








400 


405 


408-410 


412 


414- 


•416 


421- 








425 


430-431 434 


-435 


437 


439 


441- 








442 


445- 


•451 453 


-454 


456 


459 


461- 








464 


468- 


472 474 


-479 


481 


483- 


•4 85 








487- 


491 


496 499 


-501 


503- 


•504 


509- 








513 


516- 


-519 522


526- 


527 


529- 


•531 








534 


536- 


-540 542 


547- 


•549 


553- 


-559 








566- 


567 


571 574 


-577 


579 


562 


584- 








586 


589 


593 595 


-597 


601-602 


604 








606- 


-607 


611-613 


615-621 


623 


627- 








629 


633 


636-637 


642 


644- 


-650 


655








659- 


-660 


662-665 


667 


669 


674 


-675 








678 


682-684 692 


-696 


698 


700 


706 








708 


710 


716-720 


725- 


•726 


729-736 








738-739 


743-746 


749 


751 


753 


756 








759 


765 


-766 768 


770-773 


780 


784- 








79£ 


788-790 793 


796 


793 


800 


802- 








803 


810 


-811 814 


817 


819 


826 


828- 








830 


832 


834-836 


838 


843 


845 


-860 








863-864 


866-871 


877-879 


881 


-892 








894- 


-896 


898 902 


904-914 


916 


919- 








925 


927 


930-932 


935 


-936 


941 


-942 








945 


948 


-949 953 


955-956 


958 


960-








962 


964 


967 970 


-971 


973 


975 


977 








985-990 


992-993 


995-996 


999 


-1002 








1004-1009 1011


1014 


1017-1019 








1022-1023 1025 


1027 


1029-1031 








1033-1036 1038


1041 


1043 1047 








1050 1053-1054


1058 


-1059 1061- 








1062 1064 1068 


1070 


1072 1078 








1085-1086 1089-


1091 


1093 1097 








1106-1107 1110-


1113 


1115-1117 








1122-1123 1125 


1129 


1132-1133 








1135-1137 1140- 


1145 


1152 1158








1163 1168 1170- 


1174 


1176-1178 








1180 1182-1183 


1186 


1195 1198- 








1200 1202 1205- 


1206 


1211 1216 








1219-1221 1223- 


1227 


1230-1236 








123 


3-1242 1247 


1252 


1254 1256 








1258 1261-1262 


1264 


-1265 1269- 








1270 1272-1275


1277 


1280-1284 








1287-1293 1299- 


1300 


1306 1308








1312-1313 1317-


1320 


1322 1324- 








1330 1333-1335 


1339 


134 


i 1343- 








1347 1349 1353-


1357 


1359-1361 








1365-1367 1369-


1370 


1373-1374 








1377 1379-1381 


1386


-1387 1394








1400 1403 1409


1419 


1423 1425- 








1428 1430-1431 


1433 


-1434 1437- 








143 


3 1440-1442 


1446 


-1448 1450 
Tissue Origin



RNA Source 



Hyseq
Library Name 



SEQ ID NOS: 



1453 

1470- 

1463 

1506 

1521- 

1531 

1549- 

1565 

1594 

1608 

1626 

1639 

1653 

1670 

1692 

1711 

1727 

1744 

1762 

1784



1458- 
1471 
1490- 
1509
1522 
1534 
1550 
1567 
1596 
1611 
-1629 
1641 
•1655 
1675- 
1696 
1716- 
1733 
1748
1765 
1786 



1459 
1474
1493
1512- 
1524- 
1538 
1553 
1575 
1598 
1614 
163?.- 
1644 
1658 
1679 
1700 
•1717 
1737 
•1749 
1769 



1463- 
1477- 
1496- 
1513 
1525 
1541 
1555- 
1580 
1600- 
1620- 
•1632 
•1645 
•1660 
1684 
1702 
1720 
•1738 
1752 
1771 



1464 
1478 
1501 
1516
1527- 
1545- 
1556 
1589 
■1602 
•1621 
1636 
1648- 
1652 
•1688 
1707- 
1723 
1741 
1755 
-1772 



1468 

1482- 

1504 

1519 

1528 

1547

1560 

1591 

1606- 

1624 

1638- 

1650 

1669- 

1690- 

1709 

1725- 

1743- 

1760- 

1781- 



leukocyte 



Clontech



LUC003



4 35-36 44-45 61 68-
119 139 154 179 197 
324 372 404 430-431 
477 481 503 537-540 
581 589 608-609 621- 
632 647 662-664 669 
773 775-777 802 848 
879 905-907 915 949 
1002 1113 1119 1170 
1236-1237 1241 1275 
1357 1359 1377 1506 
1553 1591 1600 1613- 
1628 1670 1676-1677 
1699 1733 1738 1772 



69 75 82 102 
244 280-281 
455 461 476- 
554 575-576 
622 624 630 
679 698 764 
851 856-857 
952 990 992 
1183 1216 
1346 1353 
1515 1534 
1614 1621 
1691-1692 



25 35-36 43 80 104 126 128 150 
163 166 188-189 197 210 215 220 
271 277 280-281 310 317 336-338 
345 351 372 380-381 383 387 412
415-416 430 445 448 454 456 467
461 490 499 503 526 528 546 548
567 575-576 588 601 613 615 647 
660 665 734-735 737 759 778 787 
790 800 832 845 856 859 869 878 
883 887 905 914 932 934 958 976 
985 990 992 999-1000 1025 1031 
1038 1050 1055 1068 1074 10D8 
1099-1102 1107 1136-1138 1149 
1156 1163 1172 1190 1195 1200 
1214-1215 1217 1226-1227 1235 
1238-1239 1244 1253 1278*1230 
1293 1311 1320 1330 1334-1335 
1345 1355 1367 1386-1387 1394 
1403 1406 1414 1423 1437 1442 
1465 1521 1529 1536 1539 1541 
1547-1548 1582 1620 1626 1631 
1638 1647 1653 1660 1667 1669- 
1670 1680-1681 1696 1704 1715
1724-1725 1731-1732 1750 1760- 
1761 



melanoma from 
cell line ATCC 
#CRL 1424 



Clontech 



MEL004 



18 20-21 24 
52 55-58 60- 
80 82 89 98 
123 128 133- 
152 154 158- 
174 176 178 
196 201-206 
228 231 233- 
256 261-263 
279-281 284- 



mammary gland 



Invitrogen 



MMG001



5-8 10 12 14 
33-39 42-43 
71 73-74 79- 
106 108 112 
146 148 150 
166 170-172 
188-190 194- 
222 224 227 
251 253-254 
271 276-277 



25 29 
64 68-69 
100 103 
137 144- 
159 165- 
181-185 
210 217- 
237 247 
266-267
286 286 
Tissue Origin



RNA Source 



Hyseq 
Library Name



SEQ ID NOS: 



290 297 299 301 304 
320-321 323-325 327- 
334 339 341 344-345 
359-360 362-363 368 
303 380 390 393-395 
408 412 414-415 423
441-444 448 451-455
476 479 482 485-486
495 498 503 506 509-
519-520 522 527 529
547 549 554 557 562
589-591 597 602 607
629 632 634-640 644 
652 655 657-658 660 
672 674-676 679 682 
706-707 710 713 717 
732-734 736 738 743 
755 759 761 766 770 
789 794 803 806-807
822 827-829 837 842 
•864 866 869-870 872 
893-900 904 906-907 
921-923 926 935-937 
953-954 957 960-961 
970 977-978 984-989 
1000-1001 1005-1006
1014 1016-1017 1023 
1032-1033 1036 1039 
1055 1057-1058 1063 
1077-1078 1085 1087
1095-1102 1107-1108 
1121-1123 1131-1133 
1139-1142 1144-1145 
1153 1159 1167 1170 
1183-1185 1190-1192 
1207-1206 1212 1216 
1223 1225 1231 1234 
1247 1253-1254 1258 
1262 1270-1280 1283 
1298 1307 1314 1316 
1325 1330 1334-1335 
1349-1352 1354-1355 
1370 1377 1373 1381 
1389 1405 1414 1419 
1425-1426 1428-1429 
1437 1439 1448-1449 
1460-1464 1466 1471 
1487 1489-1491 1493 
1512 1513 1526-1528 
1536 1539 1542 1547 
1554 1561-1562 1564 
1576-1579 1581-1582
1592 1594 1596-1597 
1607-1608 1610 1612 
1621-1622 1625-1626 
1636 1641 1643-1644 
1652 1654-1655 1657 
1662 1664-1666 1669 
1674 1676-1677 1680 
1692 1701 1706 1713 
1720 1723-1728 1730 
1740 1742-1744 1746 
1751 1753 1760-1762 
1771 1774 1776-1777 
1784 1786 



309-312 318 
323 331-332 
348 350 356 
371 376 379- 
397-398 405 
430 434-437 
462-464 474 
488 490 494- 
512 516-517 
534 537-541 
572-574 587 
618 623 628- 
647-648 650-
665 667 669- 
688 695-696 
720 722-730 
747-748 750 
780 784 786-
809 814 817- 
854-858 863- 
878 881 889 
911 916 919 
946 948-949 
963 965-966 
993-997 
1008 1013- 
1025 1027 
1043 1045 
1068-1075 
1089-1091 
1112-1119 
1136-1137 
1148-1149 
1172-1173 
1196-1199 
•1218 1222- 
1240-1241 
1259 1261- 
1285-1286 
•1320 1323- 
1342-1345 
1359 1369- 
1383-1384
1421-1423 
1431 1434- 
1454 1457 
1480-1483 
1505 1507 
1532 1534 
1549-1550
1567 1572 
1587-1588 
1601-1602 
-1616 1618 
1631 1635- 
1647 1650 
1658 1660 
-1671 1673-

1685 1689- 
-1715 1719- 
-1732 1738 
-1747 1749 
1765-1768 
1779 1783- 



induced neuron 
cells 



Stratagene



NTD001 



29 35-36 80 116 123 
214 230 280-281 284- 
330 340 358 371 375 
422 424 492 497 532- 



156 163 181 
285 307 321 
377 380 382 
533 542 546 
Tissue Origin | RNA Source



Hyseq
Library Name



SEQ ID NOS: 



549 566 58 6 S9g 5T2 645-647 654 
734 775-778 780 752 759 821 826
856 858 875 936 953 985 990 992 
1041-1043 1055 1072 1104 1193- 
1194 12DG 1223 1246 1253 1274 
1288-1289 1291 1294 1311 1320 
1345 1359 1412 1423 1485 1620 
1623 1645 1684 1705 1715 1751



retinoic acid

induced 
neuronal cells 



Stratagene



NTR001



neuronal cells 



Stratagene



NTU001



5-8 78 268-269 277 383 431 506
623 677 731 999-1000 1195 1425
1426 1547 



29 65-66 80 82 110 119 146 157
166 174 181-185 198 227-228 253 
284 309 325 332 334 336-338 375 
391 393 406 414-416 454 465-466 
470 488 503 506 510-512 519 537- 
540 572-574 597 602 607 623 647
661 700 702 715 743 771 792 858 
904 948 954 977 1000 1005-1006 
1025 1064 1069 1122 1148 1185 
1219 1226 1234 1246 1271 1283 
1295-1296 1311 1317-1320 1329- 
1330 1350 1355 1365-1366 1378
1383-1384 1400 1412 1445 1505 
1539 1547 1578 1647 1656 1683 
1690 1738 1749 1783-1784 



pituitary 
gland 



Clontech 



PIT004 



311 314 379 408 419 430 454 10ST" 
1095-1096 1272-1273 1312 1320 
1378 1652 1671 1720 1725 1736 
1741 1755 



Clontech 



PLA003 



prostate 



rectum 



Clontech 



Invitrogen 



PRT001 



REC001



5-8 124 208 277 370 843 906-907 
1280 1317-1319 1359 1609 1621 
1737 

9 45 57 71 107 147 171 177 197
201 229 231 242-243 274 280-281 
307 310 317 330 358 373 382-383 
400 430 434-436 461-462 469 477 
489 497 500 505-506 513 521 526 
531-533 547 618 649 657-658 662- 
654 710 729 767 771 789 820 861 
871 874 890-891 905 938 945 963- 
964 988-989 1002 1025 1033 1045 
1061 1095-1096 1112 1125 1142
1196 1198 1202 1232-1233 1241 
1258 1272-1273 1287 1295 1313 
1333 1341 1344 1349 1360 1362- 
1363 1367 1437 1442 1447 1475 
1478-1479 1482 1489 1513 1517
1527 1531 1535 1598-1599 1628
1636 1657 1680-1681 1687-1688 
1717 1738 1743-1744 



7-18 29 33 62-63 71 73-74 83 86 
113 126 146 153 158 167-169 195 
200 206 261 309 312 341 344 368 
373 388 395 408 414 420 430 441- 
442 446 448 464 468 483 517 537- 
540 547 567 585 589 602 623 628- 
629 632 645-647 651 657-658 669 
717-719 721 725-726 738 748 750 
756 762-763 766 770 774 790 819 
825 843 849 851 881 903 909 948- 
949 960 986 995 1020 1023 1033-
1034 1064 1067 1070 1075 1086 
1108-1109 1113 1130 1139 1153 
1159 1172 1178 1185 1187-1189 
1205 1220 1225 1240 1244 1271 
1317-1320 1323 1334-1335 1350-
1351 1355 1369 1373 1375 1425- 
Tissue Origin


RNA Source 


Hyseq 


SEQ ID NOS: 






Library Name










1426 1436 1439 1469 1474 1477 








1482 1546 1587-1588 1592 1596 








1610 1622 1627 1644 1658 1662 








1665-1666 1669 1675-1677 1749 








1786 


salivary gland 


Clontech 


SAL001


10 55 97 103 110 140 149 152 158 








198 217-218 242-243 256 301 308 








312 321 333 351 354 360 410 437








448 473 487 494 496 501 535 555 








569-570 572-573 590-591 624 636 








651 759 762 764 768 771 788 800 








809 826 848 865 879 906-907 925 








933 963 1016 1020 1025 1040 1046 








1055 1066 1103 1150 1172 1181 








1234 1281-1282 1288-1289 1298








1315 1320 1333 1336-1337 1346 








1359 1373 1379 1424 1447 1449 








1474 1482 1492 1494 1498 1511 








152J-15Z4 1537 1554 1596 lo/.o- 








1627 1636 1652-1655 1658 1665 








1671-1672 1691-1692


salivary gland 


Clontech


SALs03 


158 326 1423 1463-1464 


skin 


ATCC 


SFB001


1320 1400 


fibroblast 








skin 


ATCC 


SFB002 


262 736 1025 1253 


fibroblast








skin 


ATCC 


SFB003 


709 1119 1350 1631 1653 


fibroblast 








small 


Clontech 


SIN001 


25 142 146-147 151 155 198 203 


intestine 






244 260 271 280-281 286 288 298








301-302 308 312 334 340 371 398 








408 412 414 416 423 426-427 430 








434-435 445 452 454 478 503 516








519 521 523 543 547 549 555 559 








563 569-570 585 592 604 611 626 








628-629 632 650 659 681 710 714 








718 750 764 780 798 829 842 857 








859 866 887 892 894-895 901 904 








906-907 912 919 935 997-998 1000 








1007-1008 1026-1028 1044 1055 








1089 1097 1116-1117 1131 1148 








1169 1199 1219 1234 1247 1264 








1279 1316 1320 1326 1341 1343 








1349 1351 1374 1387 1398 1400 








1403 1407 1423 1428 1468 1498 








1501 1521 1550 1556 1585 1597








1636 1638-1639 1645 1653 1656 








1662 1671 1675 1684 1691-1692 








1704 1711 1717 1719 1722 1725- 








1726 1729 1733-1734 1743-1744 








1762 1767 1780 1785 


skeletal


Clontech 


SKM001 


18 20-21 82 84 101 118 134 148 


muscle 






151 153 166 225-226 258 274 277 








299 329 361 412 414 4^0 440 452 








459 470 488 503-504 537-540 647 








C£f\ Cfk CTC 71C: IfX *7Art IGtZ 
bow — 1 X.Z3 Il3 /aw fob oJu 








905 922 950 963 982 990 992 1020 








!Od7 irtfCI 1115-1117 1191 1 1 








1228 1268 1284 1298 1321 1329 








1336-1337 1343 1409 1413-1414 








1509 1599 1624 1644 1653 1712


skeletal 


Clontech 


SKM002 


168 1683 1712 


muscle 








skeletal 


Clontech 


SKMs03 


235-236 1409 


muscle 








skeletal 


Clontech 


SKMs04


235-236 


muscle 








spinal cord 


Clontech 


SPC001 


4 9 11 17 30-31 35-36 43 4$ 60 
Tissue Origin



RNA Source 



adult spleen 



Hyseq 
Library Name 



SEQ ID NOS: 



82 85 92 94 108 110
167 198 204-205 210
259 277 280-281 300
317 372 379 387 392
430 433 448 467 473
509 513 519 524 526
547 549 551 559 567
607 616-617 623 625 
652 657-658 670-671 
682 709 711 715 719 
749-750 753 775-777 
809 820 832 834-836 
855 858 861 864 871 
898 906-908 917 919 
944 970 985 990 992 
1039 1053 1059 1065 
1077 1082 1095 1097 
1116-1117 1128 1134 
1174 1192-1194 1215 
1243 1283 1294 1307 
1323 1327 1330 1350
1356 1359 1368 1375 
1407 1423 1429 1437 
1454 1470 1482 1492 
1511 1529 1538 1548-
1571 1578 1598 1600
1627 1630 1639 1646 
1686 1695 1740 



1670 
1771 



116 139 157 
215 229 256 
-302 304 315 
419 426-427 
487 489 506 
537-540 543
569-570 593 
637 649-650 
673 679 681- 
726-729 734 
782 789 791 
847-849 854- 
-872 875 884 
924 934 942 
■993 998 1013 
1072 1075 
1103 1109 
1151 1170 
1225 1241 
1312 1320 
1353-1354 
1400 1406- 
1443 1448 
1501 1508 
1549 1565 
1614 1625 
1651-1652 
1752 1755



Clontech



SPLc01



stomach 



Clontech



STO001 



117 312 326 348 424 426-427 431 
845 866 1320 1330 1333 1344 
1355-1357 1371 1387 1397 1446 
1538 1579 1669 1686 1739 1767 



10 15-15 61 68-69 100 117 149 

197 201 227-228 231 249 273 280-
281 287 291-292 302 312 358 362 
426-427 430 446 462 475 479 535 
597 620 630 651 662-664 722 739 
790 782 785 846 919 960 964 966- 
967 976 1008 1012 1032 1042 1063 
1071 1135 1170 1208 1234-1235 
1259 1277 1280-1281 1322 1349 
1359 1369 1449 1468 1474 1478 
1487 1493 1498 1557-1559 1622 
1634 1651 1653 1729 



Clontech



THA002 



thymus 



Clontech 



THM001 



9 11 25 85 87 112 137 146 180 
190 198 206 210 212-213 235-236 
239 261 268-269 279 290 301 325 
333-334 341 351 356 364-365 379 
388 393 396 419-420 441-442 458 
477 483 508 525 531 549 567 606 
608-609 647 681 715 725-727 736 
774 782 784 794 827 883 890-891 
899-900 961 997 999-1001 1004 
1034 1055 1097 1129 1144-1145 
1150-1151 1157 1172-1173 1177 
1193-1194 1208 1220 1249 1280 
1305 1345 1355 1369 1434-1435 
1440-1441 1454 1496 1546 1549 
1562 1572 1578 1590 1594 1613- 
1614 1640 1651-1652 1671 1687- 
1688 1703 1743-1744 1746-1747 
1753



44-45 54 57-58 62-64 79 104 123 

126 134 153 193 212-213 218 242- 

243 258 274 277 279 297 301 307 

327 330 333 342 351 358 371 410 

430 445 465-466 468 471 483 487 

493 503 506 509 517 526 535 537-
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS:



thymus 



540 546 548 554 5^7 584 58^ 590- 
591 604 612 621 638-640 645-647 
649 656 660 665 670 698 710 720 
728 735 739 746 759 762 766-767
775-777 780 784-785 800 802 809 
824 826 828 845 851 858-859 864 
866 870-871 878 884 887 892 899- 
900 927 930-931 967 983 986 990 
992 999 1014 1029-1030 1033 1059 
1066 1073 1103 1107 1113 1116- 
1117 1119 1140-1142 1158 1163 
1172 1177 1195 1206 1209 1213 
1215 1218-1219 1221-1222 1227 
1271 1277 1282 1320 1329 1349
1367 1369 1383-1384 1417 1419 
1423 1425-1427 1448 1477 1488
1493 1536 1554 1620 1644 1646 
1549 1654-1655 1661-1662 1669- 
1670 1674 1676-1677 1685-1688 
1707 1711 1731-1732 1737 



Clontech 



THMc02



5-9 15-21 25 33 35-36 43-45 48 

50-51 54-55 60 75 63 87 89 93 
98-100 102 105 112 117 135-137 
141 143 146 157 167 169 192 196 
211 217-219 222 224 229 233 235-
236 240-241 244 251-252 256 261- 
262 268-269 286 288 290 295 297 
301-302 309-310 315-317 321 324 
327 334 342 350 352-353 360 370- 
373 382 384 400 403 410 414-416 
424 430-431 436 445 454-456 461
464-467 470 472 474-476 483 488
497 500 504 506 513 516 519-520 
524 526 530-531 534 537-540 549 
554-555 565-566 569-570 572-573 
575-577 586-587 595 603-604 606 
612 630-632 634 636 647 650 657- 
660 666-667 669 673-675 678 698
700 703 708 720 725-726 731 738-
739 743-744 750-753 757 759 763-
765 767 772-779 787 789-790 798 
800 810 823 829 834-836 841 848 
854-856 859 861 864 870-871 881 
890-891 898 908-909 913 928 933 
941 949 958 961 963 967 969 975 
981 986 988-990 992 999 1007- 
1008 1014 1016 1039 1041 1073- 
1074 1079 1089 1097 1109 1114- 
1117 1122 1131 1140-1141 1144- 
1145 1163 1172 1175-1177 1186
1196 1198 1206 1211 1216 1220 
1223 1227 1234-1243 1261-1262 
1267 1271 1280-1281 1284 1290
1308 1317-1320 1322 1324-1325 
1327 1330 1334-1335 1339 1346 
1350-1351 1355 1357 1360 1370 
1374 1377-1379 1386 1389-1390 
1392 1397 1400 1402 1406-1407 
1417 1423 1425-1427 1440-1441 
1466 1474 1477 1483 1493 1498 
1504 1506 1525 1536 1545 1549 
1566 1594 1598-1600 1608 1611
1614 1621 1623 1625 1632 1639 
1641 1644 1647 1649 1653-1656 
1658 1662-1663 1671 1673 1678-
1681 1686-1688 1693 1705 1707 
1711 1717-1718 1726-1727 1731- 
1733 1737-1738 1743-1745 1758-
1761 1771-1772 1779 1786
Tissue Origin
thyroid gland 



RNA Source 



Clontech



trachea 



Hyseq 
Library Name 



THR001



SEQ ID NOS:



4 3-10 20-21 37-39 4 
57 60-61 65-66 71 83 
100 102 104 110 112 
123 127 133 136-137 
153 155-158 163-164 
186 190-192 197 201- 
229 233-237 246-247 
262 265-266 268-269 
284-286 288-289 298 
311 317 321 326 332 
344 348 350 354 358 
371-373 382-383 385 
401 411 414-415 421 
433-436 443-446 450 
458 472-474 476-478 
487-488 490-494 496- 
503-504 506 509-513 
524 526-527 529 535- 
562 564 569-570 575 
595 601-602 604 606 
617 619-623 628-630 
647 649-651 660 662- 
681 690-694 696 698 
727-729 732 734 738 
745 750 759 761 763 
780 785 795-796 798
824 826 828 833 838
849 857-860 867 874 
881 887-888 890-892 
908 910-911 913-914 
927 929 932-934 937 
948 953 957 961 963
979 981-982 987 990
1004-1006 1010 1014 
1033 1038-1039 1044 
1052-1054 1055 1058 
1071 1077-1079 1088 
1105-1106 1112-1113 
1124 1126 1128-1129 
1136-1137 1142-1143 
1149-1150 1156 1161-
1170-1173 1177-1181 
1197 1200 1204 1208- 
1217 1219 1222 1230 
1235 1241 1245 1247 
1258 1260 1262 1271- 
1286-1289 1299 1306 
1330-1332 1334-1335 
1349 1365-1367 1370- 
1381 1394 1407 1419 
1437 1440-1441 1443 
1454 1459 1461-1462 
1471 1475 1477 1479 
1497-1498 1504-1505 
1522 1524-1526 1528 
1536-1537 1548 1550 
1559 1562 1567 1578
1597 1599-1601 1612 
1619-1620 1622 1624- 
1631-1632 1634 1636 
1645 1648 1651 1653- 
1660 1662-1663 1667
1675 1678-1681 1683- 
1691-1692 1703 1709- 
1724-1726 1729 1734 
1740 1743-1744 1749 
1761 1770 1777 1786



a 56-51-54- 

94-96 98- 
115-117 119 
140 149 152- 
168-169 171
203 219-220 
253 256 258 
277 280-281 
•299 302 309- 
335 341-342 
359 363 368 
394 398 400- 
424 430-431 
452 454-455 
482 484-485 
497 500-501 
516-517 519 
540 547 549 
576 588 594- 
610 612 615- 
634-635 642 
665 668 670 
700 709 721 
740-741 743 
765 770 773 
802 804 823- 
841-845 847 
-875 878 880-
894-895 898 
922-923 926- 
939 941-942 
-964 966 978- 
992 1001 
1020 1024 
1047 1050 
1068 1070- 
1094-1097 
1116-1117 
1131 1134 
1146-1147 
1164 1167 
1190 1192 
-1209 1214 
1232-1233 
1254 1257- 
-1273 1283 
1314 1320 
1342 1345 
-1372 1374 
1428*1436- 
1446-1449 
1468 1470- 
1482 1491 
1507 1513 
1531 1534 
1553 1555- 
1590-1591 
1514 1616 
1626 1628 
1639 1644- 
1656 1658 
1669 1671 
1686 1689 
1711 1717 
1737-1738 
1753 1759- 



Clontech



TRC001



29-31 46 48 87 104 107 110 135
158 222 262 266 286 301 318 331 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



352 372 377 384 414 424 44S-44f 
454 472 474 491 496 560 579 588
593 597 607 612 626 681 702 719 
810 859 866 878 894-895 912 916
922 932 935 1046 1075 1090 1099- 
1102 1113 1208 1215 1232-1233 
1237 1281 1312 1385 1387 1405
1414 1424 1430 1437 1447 1505 
1569 1579 1586 1600 1641 1653
1667 1671 1676-1677 1683 1691- 
1692 1711 1717 1726 1772 



uterus 



Clontech



UTR001 



17 19 25 41 46 57-58 61 89 104
108 139 152 174 198 200-201 206 
263-265 274 290 387 408 420 438 
446 443 452 473 491 493 499 503 
506 513 519 522 526 530 542-543 
560 601 610 632 659 665 720 751
773 780 833 845 857 872 877 912
929 934 937 996 1009-1011 1018 
1050 1075 1107 1124 1170 1219 
1258 1279 1287 1310 1320 1323 
1343-1344 1375 1437 1451-1452 
1478 1481 1498 1519 1521 1536 
1552 1579 1597 1602 1606 1620 
1626-1627 1649 1652 1661 1670 
1719 1722-1723 



TABLE 2 



PCT/US00/34263 



SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


1 


Y41736 


Homo 
sapiens 


Human PRO1114 protein 
sequence. 


1398 


100 


2 


Y66656 


Homo 
sapiens 


Membrane-bound protein 
PRO943. 


2389 


99 


3 


AF113136 


Homo sapiens 


IL-1 receptor-associated- 
kinase-M; IRAK-M 


3043 


100 


4 


AF017806 


Mus musculus 


Zn-15 transcription factor 


6351 


77 


5 


X02761 


Homo sapiens 


fibronectin precursor 


10535 


98 




X02761 


Homo sapiens 


fibronectin precursor 


8990 


89 


8 


X02761 


Homo sapiens 


fibronectin precursor 


12564 


99 


9 


AJ011679 


Homo sapiens 


Rab6 GTPase activating 
protein, GAPCenA 


5251 


99 


10 


^88501 


Homo sapiens 


Human stomach carcinoma clone 
HP10415-encoded protein. 


2381 


100 


11 


AF117754 


Homo sapiens 


thyroid hormone receptor- 
associated protein complex 
component TRAP240 


11336 


98 


12 


Z97630 


Homo sapiens 


dJ466tli.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
G))) 


894 


100 


13 


Y58620 


Homo sapiens 


Protein regulating gene 
expression PRGE-13. 


1994 


98 


14 


AF213457 


Homo 
sapiens 


triggering receptor expressed 
on myeloid cells 2 


1238 


100 


15 


AF233453 


Homo sapiens 


RACK-like protein PRKCBP1 


3124 


a a 


17 


AF201303 


Homo sapiens 


dhfr oribeta-binding protein 
RIP60 


3130 


98 


18 


AF064205 


Homo sapiens 


dynactin 1 p150 isoform 


6377 


100 


19 


U00059 


Saccharomyces 
cerevisiae 


Yhr121wp 


174 


26 


20 


AB032903 


Homo sapiens 


guanosine monophosphate 
reductase isolog 


1801 


99 


21 


AB032903 


Homo sapiens 


guanosine monophosphate 
reductase isolog 


1485 


99 


22 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent 
protein kinase kinase beta 


3083 




23 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent 
protein kinase kinase beta 


2300 


99 


24 


AJ289131 


Homo sapiens 


chondroitin 4-O- 
sulfotransferase 


2211 


99 


25 


U33460 


Homo 
sapiens 


DNA-directed RNA polymerase 
I, largest subunit 


8777 


98 


26 


Y44488 


Homo sapiens 


ACRP30R2 variant protein. 


1387 


100 


27 


U43701 


Homo sapiens 


ribosomal protein L23a 


791 


100 


28 


U02032 


Homo sapiens 


ribosomal protein L23a 


767 


97 


29 


Y41324 


Homo sapiens 


Human secreted protein 
encoded by gene 17 clone 
HNFIY77. 


1093 


99 


30 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


715 


90 


31 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


631 


82 


32 


AF231917 


Homo sapiens 


long-chain 2-hydroxy acid 
oxidase HAOX2 


1811 


100 


33 


Z29481 


Homo sapiens 


3-hydroxyanthranilic acid 
dioxygenase 


1507 


99 




AB001451 


Homo sapiens 


Sck 


2869 


100 


35 


Y00544 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1667 


99 


36 


Y00544 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1104 


98 


37 


Y78795 


Homo sapiens 


Human antiauai-2 (AZ-2) amino 
acid sequence. 


3586 


78 


38 


Y78795 


Homo sapiens 


Human anti2uai-2 (AZ-2) amino 
acid sequence . 


4726 


99 


39 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino 
acid sequence. 


3556 


77 


40 


U93121 


Homo sapiens 


M-phase phosphoprotein-1 


3747 


100 


41 


Y42750 


Homo sapiens 


Human calcium binding protein 
1 (CaBP-1) . 


795 


100 


42 


AF282626 


Homo sapiens 


latexin 


1189 


100 


43 


G02150 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6231. 


384 


94 


44 


U19617 


Mus musculus 


Elf-1 


2724 


88 


45 


U19617 


Mus musculus 


Elf-1 


2062 


86 


46 


AF100758 


Homo sapiens 


osteoinductive factor OIF 


1538 


100 


47 


Y87591 


Homo sapiens 


Human SPROUTY-1 protein, SEQ 
ID NO:24. 


1737 


99 


49 


X04145 


Homo sapiens 


T3 gamma precursor (aa -22 to 
160) 


942 


99 


51 


X63547 


Homo sapiens 


oncogene 


5645 


99 


52 


M94043 


Rattus 
norvegicus 


rab-related GTP-binding 
protein 


1089 


96 


53 


L31783 


Mus musculus 


uridine kinase 


917 


71 


54 


X83973 


Homo sapiens 


transcription factor 


4486 


98 


55 


AF224741 


Homo sapiens 


chloride channel protein 7 




99 


56 


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24. 


1491 


100 


57 


Z509D7 


Homo sapiens 


Human TBP-1 pflMl f rnm a&f-ri-nrl 

transcript. 




100 


58 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum. 


buoy 


99 


59 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum. 


4014 


91 


60 


Y59738 


Homo sapiens 


Human normal ovarian tissue 
derived protein 15 . 


601 


100 


61 


AB031069 


Homo sapiens 


protein containing CXXC 
domain 1 


1390 


100 


62 


Y66660 


Homo 
sapiens 


Membrane-bound protein 
PRO783. 


2492 


99 


63 


Y66660 


Homo 
sapiens 


Membrane-bound protein 
PRO783. 


1709 


99 


64 


S70011 


Rattus sp. 


tricarboxylate carrier 


895 


55 


65 


AF139518 


Rattus 
norvegicus 


A-kinase anchor protein 


178 


24 


66 


W29666 


Homo sapiens 


Homo sapiens DH1308 1 clone 
Secreted protein. 


157 


30 


67 


AJ245738 


Homo sapiens 


claudin-15 


2206 


100 


68 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein 


4183 


87 


69 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein 


4906 


86 


70 


Z82059 


Caenorhabditis 
elegans 


Similarity to Drosophila ring 
canal protein comes from 
this gene 


1285 


44 


71 


AF224278 


Homo sapiens 


PMEPA1 protein 


1282 


100 


72 


AF126426 


Homo sapiens 


neurotrimin 


1809 


100 


73 


Y41652 


Homo 
sapiens 


Human MSK2 protein sequence. 


2065 


99 


74 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence. 


1207 


100 


75 


AF188622 


Mus mus cuius 


selectively expressed in 
embryonic epithelia protein-1 


1465 


74 


76 


AE000406 


Escherichia 
coli 


putative DNA topoisomerase 


950 


100 


77 


X99302 


Homo sapiens 


Pop1 


655 


100 


78 


AL136538 


Schizosaccha 

romyces 

pombe 


similarity to S. cerevisiae 
kti12 protein 


210 


31 


79 


AF129756 


Homo sapiens 


G4 


1554 


99 


80 


AL096768 


Homo sapiens 


dJ858B16.2 
(phosphatidylserine 
decarboxylase (PSSC, EC 
4.1.1.65)) 


2033 


100 


81 


AL096768 


Homo sapiens 


dJ858B16.2 
(phosphatidylserine 
decarboxylase (PSSC, EC 
4.1.1.65)) 


1220 


96 


82 


Ab / Jbl 


Homo sapiens 


1-8D 


677 


98 


83 


AC005594 


Homo sapiens 


R26984 1 


2700 


98 


84 


X73113 


Homo sapiens 


fast MyBP-C 


5954 


99 


85 


AF097330 


Homo sapiens 


H1 chloride channel; p64H1; 
CLIC4 


1305 


99 


86 


AB018423 


Mus musculus 


SH2 domain-containing protein 


1360 


78 


87 


AF272151 


Homo sapiens 


adaptor protein CIKS 


3084 


99 


88 


AF196329 


Homo 
sapiens 


triggering receptor expressed 
on monocytes 1 


1214 


100 


89 


AB016879 


Arabidopsis 
thaliana 


contains similarity to pre- 
mRNA splicing 
factor~gene_id:MRB17.2 


634 


36 


90 


AJ133721 


Mus musculus 


homeodomain protein 


654 


57 


91 


AJ242864 


Mus musculus 


phtt protein 


619 


61 


92 


A61971 


unidentified 


MCSP 


11676 


99 


93 


Y99365 


Homo sapiens 


Human PRO1250 (UNQ633) amino 
acid sequence SEQ ID NO:86. 


3890 


100 


94 


Y87231 


Homo sapiens 


Human signal peptide 
containing protein HSPP-8 
SEQ ID NO:8. 


1031 


100 


95 


AF227741 


Rattus 
norvegicus 


protein kinase WNK1 


2428 


95 


96 


AF227741 


Rattus 
norvegicus 


protein kinase WNK1 


1961 


94 


97 


Y92513 


Homo sapiens 


Human OXRE-io. 


1626 


100 


98 


AL021366 


Homo sapiens 


CICK0721Q.3 (Kinesin related 
protein) 


3423 


100 


99 


AC005783 


Homo sapiens 


R33083 1 


1974 


99 


100 


Y9S293 


Homo sapiens 


Human GEF containing Jtek-like 
kinase substrate sGNX. 


4092 


99 


101 


AL11BS01 


Homo sapiens 


dJH9iN16.l (a novel protein 
(translation of the cDNA 
DKFZp566A0946, Em: AL050069) ) 


1509 


100 


102 


AJ006267 


Homo sapiens 


ClpX-like protein 


3233 


100 


103 


AF100753 


Homo sapiens 


ancient ubiquitous 46 kDa 
protein AUP1 


2042 


96 


104 


AB015982 


Homo sapiens 


serine/threonine kinase 


4718 


100 


105 


AF151074 


Homo sapiens 


HSPC240 


831 


64 


106 


M35522 


Canis 
familiaris 




354 


50 


107 


R99800 


Homo sapiens 


NTII-1 nerve protein, 
facilitates regeneration of 
nerve cells. 


2337 


93 


108 


AF125533 


Homo sapiens 


NADH-cytochrome b5 reductase 
isoform 


1290 


93 


109 


A(_005b 14 


Homo sapiens 


F23269 2 


3369 


99 


110 
111 


AF064729 
X52425 


Homo sapiens 
Homo sapiens 


RAN binding protein 16 
interleukin 4 receptor 


3285 
449£ 


100 


112 


Y41686 


Homo 
sapiens 


Human PRO274 protein 
sequence. 


2285 


100 
100 


113 


W15506 


Homo sapiens 


Mitogen activating protein 
kinase BRKl. 


1991 


100 




Y71071 


Homo sapiens 


Human membrane transport 
protein, MTRP-16. 


1190 


99 


115 


AL049548 


Homo sapiens 


dJ398G3.1 (ortholog of rat 
CPG2) 


3497 


99 


116 


AF189817 


Mus musculus 


evectin-2 


1124 


90 


117 


W30891 


Homo sapiens 


Human cytostatin llIX protein. 


715 


99 


118 


AF116618 


Homo sapiens 


PRO1038 


1469 


100 


119 


Y08915 


Homo sapiens 


alpha 4 protein 


1748 


100 


120 


AF098070 


Drosophila 
melanogaster 


Lis1 homolog 


192 


39 


121 


AF052432 


Homo sapiens 


katanin p80 subunit 


181 


37 


122 


Y70741 


Homo sapiens 


PSEQ-1 protein encoded by 
NSEQ gene associated with 
matrix remodelling. 


2637 


98 


123 


AF083246 


Homo sapiens 


HSPC028 


2132 


100 


124 


Y27096 


Homo sapiens 


Human viral receptor protein 
(ACVRP) . 


833 


99 


125 


M63109 


Leishmania 
major 


glycoprotein 36-92 


172 


27 


126 


U75467 


Drosophila 
melanogaster 


ACU 


935 


36 


127 


Z68220 


Caenorhabditis 
elegans 


Similarity to Human ADP/ATP 
carrier protein 


438 


43 


128 


AF095927 


Rattus 
norvegicus 


protein phosphatase 2C 


1927 


94 


129 


W92958 


Homo sapiens 


Human zsig44 protein. 


463 


100 


130 


AF115391 


Lactobacillus 
sakei 


ribokinase RbsK 


508 


37 


131 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


1250 


100 


132 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


916 


87 


133 


W52911 


Homo sapiens 


Human DBI/ACBP-like protein 
(DBIHI . 


705 


97 


134 


Y84444 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated 
protein. 


3230 


100 


135 


M69181 


Homo sapiens 


non-muscle myosin B 


189 


20 


136 


W74882 


Homo sapiens 


Human secreted protein 
encoded by gene 154 clone 
HE6FL83. 


480 


100 


137 


W78200 


Homo sapiens 


Human secreted protein 
encoded by gene 75 clone 
HHGAU61 . 


855 


39 


138 


AL033520 


Homo sapiens 


dJ349A12.1 (similar to 
KIAA0701 protein) 


424 


39 


139 


AF020261 


Santalum 
album 


proline rich protein 


119 


30 


140 


X70394 


Homo sapiens 


zinc finger protein 


1634 


100 


141 


Y06439 


Homo sapiens 


Human protease HUPM-8. 


936 


100 


142 


Z68493 


Caenorhabditis 
elegans 


predicted using Genefinder 


365 


42 


143 


AB018107 


Arabidopsis 
thaliana 


ADP-ribosylation factor-like 
protein 


596 


65 


144 


AF161483 


Homo sapiens 


HSPC134 


580 


51 


145 


Y84902 


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


480 


100 


146 


AB004906 


Ipomoea 
purpurea 


transposase 


146 


20 


147 


AC007357 


Arabidopsis 
thaliana 


F3F19.1B 


647 


31 


148 


W75155 


Homo sapiens 


Human secreted protein 
encoded by gene 41 clone 
HNTME13 . 


1494 


98 


149 




Homo sapiens 


cAMP-specific 
phosphodiesterase 8A 


3710 


99 


150 


Y58171 


Homo 
sapiens 


Human hydrolase homologue 
HKH-7 . 


785 


99 


151 


U10397 


Saccharomyces 
cerevisiae 


Yhr146wp 


515 


53 


152 


X73478 


Homo sapiens 


phosphotyrosyl phosphatase 
activator 


1719 


99 


153 


AL049697 


Homo sapiens 


dJ382H0.5.i (novel protein 
similar to arginyl-tRNA) 


2034 


99 


154 


AF169802 


Homo sapiens 


cytochrome b5 reductase b5R.2 


1455 


99 


155 


X94703 


Homo sapiens 


ran 2 8 


1126 


99 


156 


Y25716 


Homo sapiens 


Human secreted protein 
encoded from gene 6. 


1411 


idW 


lit 


b*7?4u4 ■ 


Homo sapiens 


Secreted salivary polypeptide 
zsig32. 


937 


100 


159 


v 17248 


Homo sapiens 


Human protein kinase 
inhibitor-2 (PKI-2). 


383 


100 


160 


J04970 


Homo sapiens 


carboxypeptidase M precursor 


2395 


100 


161 


W54040 


Homo sapiens 


Human interferon-inducible 
protein, HI PI. 


484 


98 


162 


AL022724 


Homo sapiens 


dJ413H6.1.1 (hamster 
Androgen-dependent Expressed 
Protein LIKE putative 
protein) (isoform 1) 


1357 


100 


163 


AF125535 


Homo sapiens 


pp21 homolog 


133 


45 


164 


G03632 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7713. 


463 


97 


165 


AJ250839 


Homo sapiens 


serine/threonine protein 
kinase 


1442 


71 


166 


L09649 


Zymomonas 
mobilis 


zm2 


173 


37 


167 


Y73337 


Homo sapiens 


HTRM clone 1944530 protein 
sequence. 


1204 


100 


168 


W88645 


Homo sapiens 


Secreted protein encoded by 
gene 112 clone HUKFC71. 


1084 


100 


169 


AF214731 


Homo sapiens 


ATP-dependent RNA helicase 


4402 


100 


170 


AB000871 


Methanobacterium 
thermoautotrophicum 


conserved protein 


166 


27 


171 


Y27684 


Homo sapiens 


Human secreted protein 
encoded by gene No. 113. 


821 


100 


172 


AF226044 


Homo sapiens 


HSNFRK 


2904 


100 


173 


AJ245946 


Homo sapiens 


neuroglobin 


779 


100 


174 


D43949 


Homo sapiens 


This gene is novel . 


3202 


100 


175 


Y07923 


Homo sapiens 


GTP-binding protein 


1205 


100 


176 


W30338 


Homo 
sapiens 


Human DPI homologue protein. 


966 


100 


177 


Y41675 


Homo sapiens 


Human channel-related 
molecule HCRM-3. 


1122 


100 


178 


Y41674 


Homo sapiens 


Human channel-related 
molecule HCRM-2. 


936 


99 


179 


AF220492 


Homo sapiens 


krueppel-like zinc finger 
protein H3F2 


4100 


99 


180 


X03084 


Homo sapiens 


C1q B-chain precursor 


1240 


100 


181 


U57344 


Mus musculus 


Meis3 


1813 


89 


183 


U57344 


Mus musculus 


Meis3 


1743 


86 


184 


U57344 


Mus musculus 


Meis3 


1070 


86 


185 


AF033120 


Homo sapiens 


p53 regulated PA26-T2 nuclear 
protein 


1389 


58 


186 


AF200357 


Mus musculus 


pantothenate kinase 1 beta 


1605 


82 


187 


W75058 


Homo sapiens 


Human secreted protein 
encoded by gene 2 clone 
HLDBG33 . 


1188 


99 


188 


AJ292529 


Homo sapiens 


suppressor of sterile four 1 


2424 


100 


190 


X54134 


Homo sapiens 


protein-tyrosine phosphatase 


3705 


100 


191 


Y22203 


Homo sapiens 


Human calcium-binding 
phosphoprotein, CBPP-i, 
protein sequence. 


1083 


99 


192 


W63692 


Homo 
sapiens 


Human secreted protein 12. 


1975 


100 


193 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide . 


2605 


99 


194 


AF084259 


Mus musculus 


bromodomain-containing 
protein BP75 


693 


54 


195 


Y00752 


Rattus 
norvegicus 


serine dehydratase (AA 1-327) 


994 


61 


196 


W95349 


Homo sapiens 


Human foetal brain secreted 
protein fhl70J7. 


■" 259(1 


100 


197 


AB028859 


Homo sapiens 


hDj9 


1890 


100 


198 


W95633 


Homo sapiens 


Homo sapiens secreted protein 
gene clone hm236_l. 


1614 


100 


199 


Y44277 


Homo 
sapiens 


Human nucleic acid methylase- 
2. 


2096 


99 


200 


AB030039 


Homo sapiens 


hPACPLl 


2258 


100 


201 


X54162 


Homo sapiens 


64 Kd autoantigen 


2918 


99 


202 


G02061 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6142. 


558 


99 


203 


X13885 


Nicotiana 
tabacum 


extensin (AA 1-620) 


185 


33 


204 


J04204 


Bos taurus 


32 kd accessory protein 


1837 


100 


205 


J04204 


Bos taurus 


32 kd accessory protein 


1101 


100 


207 


Y87283 


Homo sapiens 


Human signal peptide 
containing protein HSPP-60 
SEQ ID NO: 60. 


1318 


100 


208 


Y02860 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65. 


936 


98 


209 


AL121889 


Homo sapiens 


dJ1076E17.1 (KIAA0823 protein 
(continues in AL023803)) 


694 


54 


210 


AF226732 


Homo sapiens 


NPD007 


1345 


76 


211 


X66295 


Mus musculus 


C1q C chain 


970 


73 


212 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme 
UbcH2 


966 


100 


213 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme 
UbcH2 


542 


98 


214 


AJ002030 


Homo sapiens 


progesterone binding protein 


1163 


100 


215 


X70649 


Homo sapiens 


member of DEAD box protein 
family 


3933 


100 


216 


AF250558 


Homo sapiens 


claudin-2 


1169 


99 


217 


AL021453 


Homo sapiens 


dJ821D11.1 (PUTATIVE protein) 


259 


100 


218 


Y06565 


Homo sapiens 


UDP-GalNAc:polypeptide N- 
acetylgalactosaminyltransfera 
se 


3331 




219 


Y94452 


Homo sapiens 


Human inflammation associated 
protein 


2067 


100 


220 


AL035521 


Arabidopsis 
thaliana 


putative protein 


315 


42 


221 


AL031786 


Schizosaccha 

romyces 

pombe 


putative proline-tRNA 
synthetase 


"sii 


41 


222 


AL109736 


Schizosaccha 

romyces 

pombe 


WD repeat protein 


626 


40 


223 


X52493 


Glycine max 


DNA-directed RNA polymerase 


136 


23 


224 


AL0356S9 


Homo sapiens 


oM979Nl.i (dJ979Nl.l) 


5199 


98 


225 


AB032401 


Mus musculus 


mmDj4 


1761 


92 


226 


AB032401 


Mus musculus 


mmDj4 


1988 


92 


227 


X83502 


Saccharomyces 
cerevisiae 


J1007 


112 


26 


228 


X83502 


Saccharomyces 
cerevisiae 


J1007 


79 


25 


229 


AF143723 


Homo sapiens 


heat shock protein HSP60 


2557 


99 


230 


Y66677 


Homo 
sapiens 


Membrane-bound protein 
PRO828. 


982 


100 


231 


AB02746S 


Homo sapiens 


spondin 2 


1756 


99 


232 


W95634 


Homo 
sapiens 


Homo sapiens secreted 
protein. 


1391 


100 


233 


K00365 


Homo sapiens 


Human cyclin B1. 


2218 


99 


234 


Y53762 


Homo sapiens 


A GTP-binding polypeptide 
designated RAQ. 


1017 


100 


235 
236 


Z50749 
Z50749 


Homo sapiens 
Homo sapiens 


yeast sds22 homolog 
yeast sds22 homolog 


1800 
17 S4 


100 
98 


237 


AJ3026491 


Homo sapiens 


PICK1 


2137 


100 


238 


AJ270205 


Entodinium 
caudatum 


putative 
phosphatidylinositol-4- 
phosphate 5-kinase 


114 


37 


239 


AB030189 


Mus musculus 


contains transmembrane (TM) 
region and ATP binding region 


710 


93 


240 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP) . 


3785 


99 


241 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP) . 


3436 


99 


242 


AF155107 


Homo sapiens 


NY-REN-37 antigen 


996 


99 


243 


AF155107 


Homo sapiens 


NY-REN-37 antigen 


1005 




244 


AL031320 


Homo sapiens 


dJ20N2.1 (novel protein 
similar to yeast and 
bacterial cytosine 
deaminase) 


763 


99 


245 


U37026 


Rattus 
norvegicus 


sodium channel beta 2 subunit 


162 


-Trt 


246 


AL078599 


Homo sapiens 


dJ991C6.1 (novel protein 
similar to C. elegans 
P5SA12.9 (Tr:P91086)) 


2391 




247 


U32274 


Saccharomyces 
cerevisiae 


Ydr386wp; CAK: 0.12 


191 


37 


248 


Y41719 


Homo 
sapiens 


Human PRO864 protein 
sequence. 


1079 


100 


249 


AB029434 


Homo sapiens 


ghrelin precursor 


611 


100 


250 


X97831 


Rattus 
norvegicus 


carnitine/acylcarnitine 
carrier protein 


246 


38 


251 


W80993 


Homo 
sapiens 


Human RIP-interacting factor 
RIF. 


1724 


100 


252 


Y94873 


Homo 
sapiens 


Human protein clone HP02632. 


1876 


100 


253 
254 


W59878 


Homo sapiens 


Amino acid sequence of the 
cDNA clone AIP-2 (HEBGM49) . 


"t?5 


100 


255 


AL354533 

AF233322 


Leishmania 
major 

Mus musculus 


possible adenylate kinase 
zinc transporter like 2 


265 
1916 


34 


256 


Y78113 


Homo sapiens 


Human cytokine signal 
regulator CKSR-1 SEQ ID 
NO:l. 


2247 


95 
99 


257 


AL035539 


Arabidopsis 
thaliana 


putative amino acid transport 


390 


27 


258 


W74787 


Homo sapiens 


Human secreted protein 
encoded by gene 58 clone 

tirlr HNo 1 - 


1171 


100 


259 




Homo sapiens 


d»Jl87Jii.i (novel protein 
similar to protein kinase C 
inhibitors) 


974 


100 


260 


AE000909 


Methanobacte 
rium 

thermoautotr 


serine/threonine protein 
A.AnuDt: tt?j-ciueci procsin 


363 


30 


261 
262 


AL050131 
AF019661 


ophicum 
Homo sapiens 
Mus musculus 


hypothetical protein 

zeta proteasome chain; PSMA5 


62* 


100 


263 

264 


AL03559^ 
3&022318 " 


Homo sapiens 
Homo sapiens 


a«J3i0J6.i (novel protein) 
bK150C2.3 (PUTATIVE novel 


1214 
821 

1072 " " 


100 
100 

iob 








protein similar to APOBEC1) 




265 
266 


AF205940 
AJL023563 


Homo sapiens 
Homo sapiens 


endomucin 


1289 


100 


267 


AL034548 


Homo sapiens 


OJb00lil4.i (novel protein) 
CUJ1103G7.3 (novel protein 
kinase domains containing 
protein similar to 
phosphoprotein C8FW) 


789 
1888 


100 
99 


268 


AF161470 


Homo sapiens 


HSPC121 


1884 


98 


269 
270 

271 


AF161470 
HrAU /t>00 


Homo sapiens 
Homo 
sapiens 
Homo sapiens 


HSPC121 
HHa5 hair keratin type I 
intermediate filament 
ethanolamine kinase 


1232 
2190 


96 

99 


272 
273 


M52334 

AF161483 


Homo sapiens 
Homo sapiens 


intercellular adhesion 

molecule 2 

HSPC134 


1952 
1430 


100 
100 


274 
276 


Y530S2 * 


Homo sapiens 


Human secreted protein clone 
df202_3 protein sequence SEQ 
ID NO: 110. 


663 
587 


61 
100 


277 


Y77576 


Homo sapiens 


Human cytoskeletal protein 
(KCYT) (clone 2195418) . 


762 


100 




AF077042 


Homo sapiens 


30S ribosomal protein S7 
homolog 


1269 


100 


278 


Y94907 


Homo sapiens 


Human secreted protein clone 
cal0£^l9x protein sequence 
SEQ ID NO: 20. 


1619 


98 


280 


JLOOV88 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-20. 


2801 


99 


281 


Z75134 


Canis 
familiaris 


rod transducin 


1816 


100 


282 

283 

284 


Z75134 

AF249873 
AL050007 
AF201931 


Canis 

familiaris 
Homo sapiens 
Homo sapiens 


rod transducin 

muscle-specific protein 
hypothetical protein 


1718 

139$ 
"405 


96 

100 
98 


285 
286 

287 
288 


AF156102 
Y35897 

U88964 
AL050143 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


DC1 

ELL complex EAP30 subunit 

Extended human secreted 
protein sequence, SEQ ID NO. 
146. 

HEM45 

hypothetical protein 


1859 
1318 
1250 

923 


99 
99 
99 

100 


289 
290 

291 


AJ011O38 
Y66724 

AF034801 


Homo 
sapiens 
Homo sapiens 


Lcj.cLuunin 

Membrane-bound protein 

PR083C. 

lipr in- alphas 


598 
574 
2321 

2565 


100 
100 
100 

98 


292 
293 

294 


AF034001 
AL049851 

"¥73348 


Homo sapiens 
Homo sapiens 


liprin-alpha4 

dj889J22B.l (novel protein 

(isoform 1) ) 


" 2590 
1738 


100 
100 


295 


L11G72 


Homo sapiens 
Homo sapiens 


HTRM clone 839651 protein 

sequence. 

zinc finger protein 


"1245 


99 


297 


AliOJ 5423 


Homo sapiens 


dJ20I3.i (brain mitochondrial 
carrier protein-1 (BMCP1)) 


1694 
1024 


44 

79 


~298 


AF198532 


Homo sapiens 


lymphoid enhancer binding 
factor-1 


2173 


100 


299 


ID ,91 41 


Homo sapiens 
Homo sapiens 


HSPC299 


1147 


85 


300 


breast cancer metastasis- 
suppressor 1 


1236 


99 


301 




Rattus 
norvegicus 


Inositol polyphosphate 4- 
phosphatase 


160 


30 


302 


AF036145 
Z82022 


Homo sapiens 
Homo sapiens 


r™ expressea antl 9en 
uicNac-1-i* transferase 


3458 


100 


303 


AF269232 


Mus musculus 


butyrophilin-like protein 
BUTR-1 


2067 
271 


99 
50 


304 

305 


AJ222544 


Arabidopsis 
thaliana 


asparaginyl-tRNA synthetase 


659 


50 


306 


AF054180 
&J2 ^2079 


Homo 
sapiens 

Homo sapiens 


hematopoietic cell derived 

zinc finger protein 

APOBEC-1 stimulating protein 


351 


79 


308 

309 


X44486 

W31891 


Homo 
sapiens 
Homo sapiens 


Human GPRW receptor 
polypeptide. 

DNA polymerase mu 


3056 
1721 

5591 


100 
100 

fbli 


310 




Homo sapiens 


P J U Vo{, 


12*48 


92 


311 


r^r X < D J i J 




c-oox. procem cBuiz 


1501 


93 


312 


57802 




immunoglobulin lambda light 
chain 


959 


8*1 


313 


Z36715 


Homo sapiens 


Net 


2048 


98 


314 


ft JT1 #C1 


Homo sapiens 


HbPLU4 / 


727 


100 


315 


nf t&UOUDD 


Homo sapiens 


kelch-like protein KLHL3a 


3046 


100 


316 


X ttOOOO 


sapiens 


Membrane-bound protein 
PRO1013. 


1166 


100 


317 


Y29666 


Homo sapiens 


Human Ras protein RAPR-i, 


1253 


58 


318 


AJ387747 


Homo sapiens 


sialin 


2614 


99 


319 


AF161362 


Homo sapiens 


HSPC099 


224 


40 


320 


Y68773 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-5. 


2243 


99 


321 


AJ238379 


Homo sapiens 


putative TH1 protein 


3013 


100 


322 


AB040812 


Homo sapiens 


protein kinase PAK5 


3792 


99 


323 


Y95013 


Homo sapiens 


Human secreted protein 
vc48_l, SEQ ID NO: 66. 


913 


100 


324 


Y13381 


Homo sapiens 


Amino acid sequence of 
protein PR0271. 


1976 


100 


325 


Y94944 


Homo sapiens 


Human secreted protein clone 
bfl57^16 protein sequence 
SEQ ID NO: 94. 


2305 


98 


326 


Y76884 


Homo sapiens 


Retinoblastoma binding 
protein-7 sequence. 


6728 


99 


327 


AF198532 


Homo sapiens 


lymphoid enhancer binding 
factor-1 


2173 


100 


328 


Z780li 


Caenorhabdit 
is elegans 


Similarity to Drosophila 
Cadherin-related tumor 
suppressor 


569 


33 


329 


AF212921 


Mus musculus 


MMTV receptor variant 1 


484 


94 


330 


Z75330 


Homo 

sapiens] 

>R65207 

R65207 02- 

MAR-1995 27- 

AUG-1993 

Human 

stromalin-1. 

(Homo 

sapiens 


nuclear protein SA-1 


6492 


99 


331 


AL008583 


Homo sapiens 


dJ3270i6.3 (supported by 
GENSCAN, FGENES and GENEWISE) 


2133 


99 


332 


Y36104 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
489. 


310 


41 


333 


>iU Z. /lob? 


Homo sapiens 


putative sialoglycoprotease 


1747 


100 


334 


AF156598 


Mus musculus 


p53-regulated DDA3 


997 


64 


335 


M99058 


Eimeria 
maxima 


em100 gene is homologous the 
Eimeria tenella gene et100 


154 


26 


336 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3386 


97 


337 



Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


2602 


94 


338 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3447 


98 


339 


Z66561 


Caenorhabditis 
elegans 


Similarity to Human rab13 
protein (PIR Acc. No. 
A49647) . 


716 


34 


340 


AB021643 


Homo 
sapiens 


gonadotropin inducible 
transcription repressor-3 


2761 


99 


341 


G01946 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5027. 


465 


98 


342 


AF020591 


Homo sapiens 


zinc finger protein 


1091 


48 


343 


L29154 


Homo sapiens 


immunoglobulin heavy chain 
VDJ region 


439 


84 


344 
345 


U10281 


Sus scrofa 


gastric mucin 


iV9" 






Ak60C404 


Homo sapiens 


unnamed protein product 


1177 


99 


346 


L22557 


Rattus 
norvegicus 


calmodulin-binding protein 


1949 


R4 

o** 


347 


L22557 


Rattus 
norvegicus 


calmodulin-binding protein 


2363 


91 


348 


AL049481 


Arabidopsis 
thaliana 


AIG1-like protein 


316 


30 


350 


AJ251516 


Mus musculus 


cysteine and histidine-rich 
protein 


1460 


99 


351 


AK024477 


Homo sapiens 




1773 


100 


352 


U50133 


Homo sapiens 


ankyrin 


502 


33 


353 


AK000625 


Homo sapiens 


unnamed protein product 


721 


100 


354 


AF161420 


Homo sapiens 


HSPC302 


2623 


97 


355 


AJ010014 


Homo sapiens 


M96A protein 


1269 


47 


356 


AF151029 


Homo sapiens 


HSPC195 


941 


91 


357 


AL022327 


Homo sapiens 


dJ35SC10.1 (KIAAOOS>'7I 


1911 


100 


358 


W78128 


Homo sapiens 


Human secreted protein 
cuwucu Lj y ytsnc -s cxoiiB 
HOSBI96. 


1117 


100 


359 


X03414 


Drosophila 
melanogaster 


Kr polypeptide 


Jib 


45 


360 


AF151079 


Homo sapiens 


HSPC245 


643 


100 


361 


Y53896 


Homo sapiens 


A suppressor of cytokine 
signalling protein 
designated HSCOP-6. 


530 


41 


362 


AF254741 


Drosophila 
melanogaster 


Centaurin Gamma 1A 


681 


46 


363 


AF213465 


Homo sapiens 


dual oxidase 


2016 


100 


364 


AF181562


Homo sapiens 


proSAAS 


;i3i9 


100 


365 


AF181562 


Homo sapiens 




1024 


99 


366 


U73200 


Mus musculus 


pll6Rip 


884 


82 


367 


AF263744 


Homo sapiens 


erbb2-interacting protein
ERBIN 


4973 


99 


368 


U37501 


Mus musculus 


laminin alpha 5 chain 




72 


369 


AF043695 


Caenorhabditis
elegans


Similar to the protein
phosphatase 2c family


549 


36 


370 


Y73440 


Homo sapiens 


Human secreted protein clone 
y}23 i protein sequence SEQ 
ID NO: 102 . 


1484 


99 


371 


AF272833 


Homo sapiens 


misato 


2869 


97 


372 


AF198454 


Homo sapiens 


epithelial protein lost in 
neoplasm beta 


3927 


100 


373 


Y7334S 


Homo sapiens 


HTRM clone 438283 protein 
sequence . 


273 


80 


374 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase


2717 


98


375
376


A951G6 


unidentified


RED ALPHA 


1202 


99 


377 


W74828 
Y32131 


Homo sapiens
Homo sapiens 


Human secreted protein 
encoded by gene 100 clone 
HL0A352 . 

Human LYST-2 protein. 


1012 




378 
379


M14912 
AF090934 


Homo sapiens 
Homo sapiens 


pol 

PRO0518 


3S5S 
132 


99 
86 

TiTn 


380 

O Q1 




Homo sapiens


serine/threonine protein 
kinase 


382 
2499 


100


382


Y41699 


Homo 
sapiens 


Human PRO703 protein
sequence.


2362 


100 




AF174498


Homo sapiens 


GR AF-i specific protein 
phosphatase 


7008 


98 


383 


U64608 


Caenorhabditis
elegans


coded for by C. elegans cDNA 
ykl73cl2.S 


246 


36


385


U50133


Homo sapiens 


ankyrin 


502 


33 




AJ238520


Homo sapiens 


putative transcription 
factor-like nuclear regulator


4123 


97


387 


AF208845 






1375 


99 


389 


X57821


Homo sapiens 


immunoglobulin lambda light
chain


797 


76 


390 


AF182404 


Homo sapiens 


mitochondrial uncoupling 


1670 


99 


391


" VflS5g4 




(Ha-UNC-53/1 1 qprni^nnp 


3366 


97 


393 


AF178432 


Homo sapiens 


SH3 protein 


3700 


100 


394 


AF229928 


Drosophila
melanogaster 




1616 


62 


395


AF181721 


Homo sapiens 


R02S 


2254 


100 


396


Y69197 


Homo sapiens


Amino acid sequence of a 
human beta IV-spectrin
protein. 


1626 


98 


397


U4 8 2 'A 8 


Mus musculus 


zinc finger protein neuro-d4


749 


60 


398




Homo sapiens 


hypothetical protein 


263 


51 


399 


AF217525


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


*0 


400 




Schizosaccharomyces
pombe


WD repeat protein 


447 


27 


401 


AC004859


Homo sapiens 


similar to 2-oxoglutarate 
dehydrogenase similar to 
Q02218 (PID:g1352618)


4176


78


402 


AB010266 

Hi 1 i ^lono 


Mus musculus 


tenascin-X 


10246


"S2 


403


AL133288


Homo sapiens 


dJ671D7.1 (similar to
D. melanogaster CG5986
protein)


761 


100 


404 




Caenorhabditis
elegans


ZCS18 .3b 


888 


48 


405


a / owl j 


Caenorhabditis
elegans


Similarity to Drosophila 
Cadherin-related tumor
suppressor


569 


33 


406 


AB031230 


Homo sapiens 


protein containing CXXC 
domain 2 


1196 


97 


407 


AF155106


Homo sapiens


antigen 


1168 


100


408 


Y57945 


Homo sapiens


Human transmembrane protein 
HTMPN-69. 


1538 


99 


409 


Z18361 


Ovis aries 


trichohyalin 


184 


30 


410 


AF249744 


Homo sapiens


KJIOGEF 


2733 


100 


411 


AF176529 


Mus musculus


F-box protein FBX13 


2072 


94 


412 


AF210842 


Homo sapiens


HARP 


4880 


100 


413 


AL031659 


Homo sapiens 


cLl310O13.7 (novel protein 
similar to H. roretzi HRPET- 
31 


775 


98 


414 


X57398 


Homo sapiens 




6131 


99 


415 


AB029826 


Homo sapiens


3-methylcrotonyl-CoA
carboxylase biotin-containing
subunit


2961 


99 


416


U43503 


Saccharomyces
cerevisiae




115 


42 


417 


AL160493 


Leishmania 
major 


^uooiuie t^oii / . zi 


239


35 


418 


Y0810Q 


Homo sapiens 


Human nrnf&in 


330 


29 


419


U15131 




pl26 


2228 


54 


420 


AF117946 


Homo sapiens 


Link guanine nucleotide 
exchange factor II 


2363 


100 


421 


AF190635 


Drosophila
melanogaster 


ankyrin 2 


755 


30 


422 


AF302150 


Homo 
sapiens 


phosphoinositol 3-phosphate-
binding protein-2


1962 


100 


423 


AL137530


Homo sapiens 


hypothetical protein 


433 


94 


424 


X63753 


Homo sapiens 


son-a


7269 


100 


425 


AB027249 


Homo sapiens 


MAPKK like protein kinase 


1693 


100 


426


AF279144 


Homo sapiens 


tumor endothelial marker 7
precursor


1084 


55


427


AF279144 


Homo sapiens 


tumor endothelial marker 7 
precursor 


1259 


56 


428


AE003683


Drosophila 
melanogaster 


CG8312 gene product 


149 


29 


429 


Y07829 


Homo sapiens 


RING finger protein


2201 


99 


430


AF096897


Drosophila 
melanogaster 


pushover 


4442 


47 


431 


U41387 


Homo sapiens 


Gu protein 


4021 


99 


432 


AF023674 


Homo sapiens 


nephrocystin 


3783 


100 


433 


AF146760 


Homo 
sapiens 


oc^jliii ^ AAjte ceAj, envision 
control protein 


2284 


100


434


AB006697 


Arabidopsis
thaliana 


cleft lip and palate
associated transmembrane
protein-like


886 


42 


437 


Y94247 


Homo sapiens 


Human calcium binding protein 
hCBP.


1704 


100


438


AB040672 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase


1075


63


439 


AF105228 


Bos taurus 




285 


33 


440 


R06463 


Homo sapiens 


u-tivea piotein oi clone 
ICA13 (ATCC 40553) . 


3073 


99 


441 


X14971 


Mus musculus 


aipjia-aaapcin IA/ (AA 1-377} 


4897 


98 


442 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 1-
938) 


3979 


81 


443 


Y66639 


Homo sapiens


Membrane-bound protein
PRO1136.


3299 


99 


444 


AC067754 


Arabidopsis 
thaliana


unknown protein; 20348-23707 


114 


33 


445 


AF229032 


Mus musculus 


piL 


2077 


33 


446 


AF05SQ3S 


Rattus
norvegicus 


5-nexiiin 


2662 


85 


447 


AF132484 


Mus musculus 


li^l/ll l(J wll 


478


51 


448 


WBD024 


Homo sapiens 


Polypeptide fragment encoded 
yene i_>b . 


528 


45 


449


AF161445 


Homo sapiens 


HSPC327 


1606


100 


450
451


Z68753 


Caenorhabditis elegans


iwlO • JO 


951 


49 




W39160 




nuukUA uxaji complement 
factor H protein fragment 3. 


155 


32 


452


W85727 


Homo 
sapiens 


BM46JL0) . 


2799 


99 


453 


Y53629 


Homo sapiens 


A bone marrow secreted 
protein desianafc#*ri ftMQi i ^ 


2810 


100


454


D87438 


Homo 
sapiens 


similar to a C. elegans 
protein in cosmid C14H10 


4069 


100 


455 


AF240468 


Homo sapiens 


nicastrin


3687 


100 


456 


Z15605 


Homo sapiens 


CENP-E


13305 


99 


457 


M59216 


Homo 
sapiens 


gamma-aminobutyric acid

receptor beta-1 subunit 


2477 


100 


458 


Y73467 


Homo sapiens 


Human secreted protein clone 

VdSl 1 T3rOhpir» QPrni^nro CPA 

ID NO:156. 


966 


100 


459 


W^7824 


Homo sapiens 


Human secreted protein

encoded by gene 18 clone 
HSLFM29. 


535 


100 


460 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor


2794


19 


461 


D87446


Homo sapiens 


Similar to a C. elegans 
protein encoded in cosmid 
C27F2 (U40419) 


9196 


99 


462 


G04044 


Homo sapiens 


Human secreted protein, SEQ

ID NO: 8125. 


486 


93 


463 


AC002398 


Homo sapiens 


F25965 1 


1018 


100 


464 


AF064856 


Rattus sp. 


7acomp protein 


1845 


84 


465 


AF223408 


Homo sapiens 


B99


3686 


99


466


AF223408
AF104415


Homo sapiens
Mus musculus


B99

gene trap locus- 13 


2878 


87 


468 


U53450 

• 


Rattus
norvegicus 


JDP-l 


6336 

hqg 


91 
4 9 




AL031297 


Homo sapiens


dJ97P20.1 (novel gene)


3S64 


99 


470 


AF257077


Homo sapiens


eukaryotic translation 
initiation factor EIF2B 
subunit 3 




95 


471 


L28125 


Podospora 
anserina 


beta transducin-like protein


284


38 


472 


Y84903 


Homo sapiens 


A human proliferation and 
apoptosis related protein.


2337 


100 


473 


AF144237 


Homo sapiens 


LOMP protein 




44 


474


Y71213


Homo sapiens 


Human irritable bowel disease 
related nolvoeiDtidp tmyto 


838 


100 


475


Y95006 


Homo sapiens 


Human secreted protein 
vel3_l, SEQ ID NO: 52 . 


3411 


100 


476 


D38549 


Homo sapiens 


hal025 is new 


6533 


99 


477 


AP241230 


Homo sapiens 


TAX1-binding protein 2


3656 


100 


478 


AL031534


Schizosaccharomyces
pombe


putative asparagine synthase


482 


40 


479 


L28125


Podospora 
anserina 


beta transducin-like protein


233 


26 


480 
481 


AF161544 
AJ23824B 


Homo sapiens 


HSPC059 


434 


77 


482 
483 


Z38061 
AF161381 


Saccharomyces
cerevisiae

Homo sapiens 


malS, seal, len: 1367, CAI: 
0.3. AMYH YEAST PflRCan 
GLUCOAMYLASE S1 (EC 3.2.1.3)
HSPC263 


3986 
295 


99 
23 


484 


AF22346B 


Homo sapiens 


AD021 protein 


1404 
1314 


100 
100 


486 


X57527


Homo sapiens 


alpha 1(VIII) collagen


4166 


99 


487 


Y19062 






2475 


100


488 


Y73373 


Homo sapiens 


•***«• j^iouj pro«_em 
sequence . 


555 


56 


489 


AL021918 


Homo 
sapiens 


b34I8.l (Kruppel related 5*inc 

Pinoer nrnhai' n 1 a a \ 


4184 


100 


490 
491 


X53773 
U52426 


Rattus
norvegicus
Homo sapiens


alpha-c large chain (AA 1-
938)

GOK 


4675 


97 


492 


AL353773 


Leishmania 
major 


possible threonine synthase 


1453 
702 


59 
45 


493 


AF226614


Homo sapiens 


ferroportin1


2929 


100 


494 


Z93241 


Homo sapiens 


with some similarity to 
Drosophila KKAXKN) 


513 


96 


495 


AF036977 


Homo sapiens 


unknown 


1812 


100 


496 


U93S64 


Homo sapiens 


p40 




45 


497 


Y9140£ 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 2 
SEQ ID NO: 126.


357 


100 


498 


AF0697B1 


Drosophila 
melanogaster 


Sem46-like protein


653


43 


499 


YlttOl 


Homo sapiens 


Human cell-cycle
phosphoprotein CECYP-2. 


1658




500 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


3883 


100 


501 
502 


AF027503 
AF282874 


Mus 

musculus 
Homo sapiens 


putative membrane-associated
guanylate kinase 1 
nectin 3; PRR3 


205 
28£o 


36 
99 


503
504
505
507


AJ249732 
AF208861 
L09708 


Homo sapiens 
Homo sapiens 
Homo sapiens 


G8 protein 
BM-019 

complement component C2


669

1629 

4022 


100 
100 

100


508 


X66285 
D00189 


Mus musculus

Rattus 

norvegicus 


HC1 ORF 

Na+,K+-ATPase alpha-subunit


IIS 
5227 


43 
99


509 

510


Y94971 


Homo sapiens 


Human secreted protein clone
fa 171 jl protein sequence SEQ
ID NO:148.


2176 


100 




AB019038


Homo sapiens


beta-1,4 mannosyl transferase 


781 


77


511 


AB019038 


Homo sapiens


beta-1,4 mannosyl transferase


1347 


100 


512 


AB019038 


Homo sapiens 


beta-1,4 mannosyl transferase


1520 


~ —qE 


513 


' X849dB 


Homo sapiens 


phosphorylase kinase 


5729 


OQ 


514 


X52851


Homo sapiens


peptidyl prolyl isomerase


650 


76


515 


AF186084 


Homo 
sapiens 


epidermal growth factor 
repeat containing protein 


3046 


99 


516 


G03602 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7683. 


505 


99 


517 


U04706 


Bos taurus 


50 kDa protein 


1749 


77 


518 


Q00G53 


Homo sapiens 


ID NO: 4734. 


D J U 


100 


519 


AF161475 


Homo sapiens 


HSPC126 


1368


100 


520
521


Y99366 
AF2668S2 


Homo sapiens 
Homo sapiens 


Human PR01475 (UNQ746) amino 

neid flpmi^n^o ct!*a tt\ ma. q a 
HV,1U oc^u^nce oijy xu INU : HO. 

PTPLA 


3394 


97


522


AB0O099S 


Archaeoglobus
fulgidus


chromosome segregation 
protein (smcl) 


1295 
153 


100 
20 


523 


AF0S2249 




immunoglobulin heavy chain
variable region 


60S 


S7 


524


AJ223830 


Rattus 
norvegicus 


ARB1 


2950 


98 


525 


W01535 


Homo sapiens


Cellular homologue of the
SV40 large T antigen. 


1276 


83 


526 


AF145658


Drosophila 
melanogaster 


BcDNA.GH10229 


320 


33 


527 


AF112213 




putative Rab5-interacting
protein 


521


79 


528


D49387 


Homo sapiens


NADP dependent leukotriene b4 
12-hydroxydehydrogenase


1616 


100 


529


Y30819 


Homo sapiens 


Human secreted protein 
encoded from gene 9.


328 


32 




AL079335 


Homo sapiens 


dJ132F21.3 (72.1 KDa protein
li/ftr4r:>o*A032 , SBBI8 8;
similar to mouse IFN-gamma


1059 


9S 1 


531 


^91506 


Homo sapiens 


Human secreted protein 
oc^ucu^c encoaea joy gene 5o 
SEQ ID NO: 179.


1159 


98 


532 


X76116 


Caenorhabditis
elegans


carrier protein (c2) 


S76 


50


533 
534


X76116 


Caenorhabditis
elegans


carrier protein (c2)


506 


50




X12966 


Homo sapiens 


3-oxoacyl-CoA thiolase 
propeptide (424 AA) 


1972 


100 


535 


l\JJ4 o f 


Homo sapiens 


flavin-containing
monooxygenase 2 


2486 


100 


536 
537 


Z11773 
D84224 


Homo sapiens
Homo sapiens 


SRE-ZBP 

methionyl tRNA synthetase


2201 




538 


D84224 


Homo sapiens 


methionyl tRNA synthetase


4741 
3887 


99

99 


539 
540 


D84224 
D84224 


Homo sapiens 
Homo sapiens 


methionyl tRNA synthetase 
methionyl tRNA synthetase


2933 


96? H 


541 
542 


J03244 
Y92514 


Bos taurus
Homo sapiens 


H+ ATPase 31kDa subunit (EC 

3.6.1.3) 

Human OXRE-n. 


4529 
848 


99 

7?


543


AF221712 


Homo
sapiens 


Smad- and Olf-interacting
zinc finger protein


2301 
2151 


99
61 


544


AE000919 


Methanobacterium
thermoautotrophicum


conserved protein


207


38


545










A06669 


synthetic 
construct 


preTGF-beta1


2070


99


547


Y02698 
AF112205 


Homo sapiens 
Homo sapiens 


Human secreted protein
encoded by gene 49 clone 
HTPCS60 . 
WSB-1 protein 


8*4 
2275 


98 


100 


548 
549

550


X60271 
AC016827 


Mus musculus 
Arabidopsis 
thaliana


c-rel 

putative GTPase 


2264
810


/ % 
42


551


Y70400 
AB048365


Homo
sapiens 
Homo sapiens 


Human cell-signalling
protein-2.

NEDD4-like ubiquitin ligase 1


429 




552 


Y57880 


Homo sapiens 


Human transmembrane protein 
HTMPN-4. 


8290 
ill2 


99 
95 


553 


AF1198SS 


Homo sapiens 


PRO1847


265 


67 


554 


M17236


Homo sapiens 


MHC HLA-DQ alpha precursor 


1332 


100 


555 


AL078468 


Arabidopsis 
thaliana 


putative protein 


540 


40 


556 


AC006963 


Homo sapiens 


similar to Kelch proteins;
similar to TiftA"?"7n0 7 


SIS 


44 


557 


AK024487 


Homo sapiens 


(PID:g4650844)






558 


M12140 


Homo sapiens 


pol gene protein; Xxx 


1623 
117 


98 
48 


559
560 


W74825 
X56681 


Homo sapiens 
Homo sapiens 


Human secreted protein 
encoded by gene 97 clone 
HAQBF73 . 
junD protein 


225 


56 


561 


AF003136


Caenorhabditis
elegans


contains weak similarity to 
an AMP-binding motif 


373 
2926 


88 
54 


562 


AL109839 


Homo sapiens 


dJ1069P2.3.1 (novel PABPC1 

(poly (A) -binding protein) 


877 


100 


563 


AF181640 


Drosophila 
melanogaster 


BcDNA.GH09817


289 


42 


564 
565 


AF052723 
AF161472


Feline leukemia
virus

Homo sapiens 


gag-pol precursor polyprotein
gPr80 

HSPC123 


1547 
439 


43 
44 


566 
567 
569 
570 
571 


Y28817 

U09848 

AF155113 

AF155113 

AL032821 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


pt326_4 secreted protein. 
NY-REN-55 antigen 
dJ55C23.1 (vanin 1)


3338 
1738 
3603 
3951 


100 
100 
93 
99 


572
573
574

575


M69181 
M69181
Y59678 


Homo sapiens 
Homo sapiens 
Homo sapiens 


non-muscle myosin B 

non-muscle myosin B

Secreted protein 108-008-5-0- 

E6-FL, 


1821 
7350 
7311
772 


98 
99 
98 
100 




AL365234


Arabidopsis 
thaliana 


putative protein 


788 


40 


576 


AL365234


Arabidopsis 
thaliana 


putative protein 


788 


40 


577 
578 


AUfa /4b 

AB041642 


Homo sapiens 
Homo sapiens 


DNA polymerase alpha-subunit
(AA 1-1462)
PAR-6


7619 


99


579


Uao 70% 


Homo sapiens 


similar to yeast adenylate 
cyclase (S56776) 


1342 

2446


100 
100 


580
581


AF165124 


Homo sapiens 


receptor gamma 2 


2499 


99 


582


W88812 
U82319 


Homo sapiens 
Homo sapiens 


Polypeptide fragment encoded 
by gene 58. 
novel ORF 


2339 


99 


583
584 


P92219 
AJ22394B 


Homo sapiens 
(human)
Homo sapiens 


CR1 protein.
RNA helicase 


342 
11425 


100 
99 


585 


Y08612 


Homo sapiens 


88kDa nuclear pore complex 
protein 


6608 
3874 


99 
99 


586 
587 


V42384 

AF12975I 


Homo 
sapiens 

Homo sapiens


Amino acid sequence of 

Iv3l0 7. 

BAT4 


1007 
1873 


37 
98


588 
589 


AK131775
AJ250865


Homo sapiens
Homo sapiens


Unknown

TESS 2


1929 


99 


591 


Z98885 


Homo sapiens 


containing 1 (similar to 
peregrin, BR140)) 


2346 
41b / 


100 
100 


592 
593 


L76*7l 
AF091622 


Homo sapiens 
Homo sapiens 


nuclear hormone receptor

PHD finger protein 3


1355


100 


594 


X56807 


Homo sapiens 


desmocollin type 2a 


9054 
4443 


100 
100 


595 


AL137802


Homo sapiens 


dJ798A10.1 (novel protein) 


212 


' ss 1 


596


AL022329 


Homo 
sapiens 


DK407F11.2 (adrenergic, beta, 
receptor kinase 2) 


3653 


100 


597


AF226048 


Homo sapiens 


GL003 


2009 


99 


598


M\J A fO±±Z 


Homo 
sapiens) 

>Y4 9635 
OCT- 19 9 9 15- 

x tan i a a o 

Human sdp3*5 
protein. 
(Homo 
sapiens 


putative cell cycle control 
protein 


33S 


23


599 


Y59741 


Homo sapiens 


Human normal ovarian tissue 
derived protein 10. 


1574 


99 


600 


L36531


Homo sapiens 


integrin alpha 8 subunit 


5386 


99 


601 


Y38458 


Homo sapiens 


Human secreted protein 
encoded by gene No. 20. 


895 


100 


602 


AF218584 


Homo sapiens 


~5gaT — 


3265 


100 


603 
604 


V13115' 


Homo sapiens 
Homo sapiens 


serine/threonine protein
kinase 

dJ393D12.1 (KIAA0776) 


5071 


99 


605 


AL024452 


Homo sapiens


dJ682J15.1 (novel Collagen 
triple helix repeat 
containing protein) 


2413 
1979 


99 
100 


606 


Y14494 


Homo sapiens 


aralarl 


34S5 


99 


607 


AJ001981 


Homo sapiens


UAAJ.Jj 


2603 


100 


608 


X86098


Homo 
sapiens 


binds directly to adenovirus 
type 5 E1A protein


306*9 


100 


610 


AF163572


Homo sapiens 


synthetase 


1865 


99 


611 
612 


AF161503 
L41834 


Homo sapiens
Ensis minor


HSPC154 

nuclear protein 


1261 


97 


613


Y91954


Homo sapiens 


Human cytoskeleton associated 
protein 9 (CYSKP-9).


34* 
3668 


30 
100 


614


AL022327


Homo sapiens 




361 


94 


615 


X85786


Homo sapiens


binding regulatory factor


3203 


100 


616 
617 


Y08319 
D12644 


Homo sapiens 
Mus musculus 


kinesin-2 
KIF2 protein 


3487
3609


99 
97 


618 
619


U287S9 
Y35914 


Mus musculus 
Homo sapiens 


Extended human secreted 
orotein ft prm ortr^ cert t n »m 
163. 


5936 
1684 


89 
99 


620 
621


AB046382 


Mus musculus 


testis— afciunrfa nt- finnov 

protein 


199 


23 


622


Y00062 
AF068286 


Homo sapiens 
Homo sapiens 


to 1120) 
HDCMD38P 



3440 

861


99 


623 

624


X98248


Homo sapiens 


sortilin 


4436 


100 
99 




X61100 


Homo sapiens


75 kDa subunit NADH
dehydrogenase precursor 


3734 


99 


625 


S58544 


Homo sapiens 


75 kDa infertility-related
sperm protein 


2125 


99 


626 


AF151027 


Homo sapiens


HSPC193 


582 


93 


627 
628


X14968


Homo sapiens 


RII-alpha subunit (AA 1-404)


2079 


100 




Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_1 derived protein


1983


100


629 


Y50911 


Homo sapiens


vb7_l derived protein 


1694 


100 


630 


AF098786 


Homo 
sapiens 


17 beta-hydroxysteroid 
dehydrogenase type VII


1^54 


100 


631 


AL034555 


Homo 
sapiens 


dJ134019.3 (zinc finger 
protein 151 (pHZ-67))




100 


632 


W74826 


Homo sapiens 


Human secreted protein 
encoded by gene 98 clone 
HAQBT94 . 


794 


96 


633 


AF288288 


Homo sapiens 


HPT protein 


' 223* 


100


634 


AF041429 


Homo sapiens 


pRGRl 


823 


99 


635 


X66357 


Homo sapiens 


kinase 


1589 


100 


636


Y11284 


Homo sapiens 


AFX1 


2572 


98 


637 


AR0048B4 


Homo sapiens 


PKU-alpha 


J /lc 


99 




AJ002303 


Homo sapiens 


synaptogyrin 1c


1020 


100 


639 


AJ002304 






1002 


100 


640 


AJ002303 


Homo sapiens 


synaptogyrin 1c


933 


94 


641 


D87682 


Homo sapiens 


similar to a C. elegans
T26A5.


2676 


100 


642 


M14660 




ISG-K54


2473 


99 


643 


X06661 


Homo sapiens 


calbindin (AA 1-261) 


1358 


100 


644 


AF119900






185 


76 


645 


AB031048 


Drosophila 
melanogaster 


microtubule associated- 
protein orbit 


738 


27 


646 


AF250842 


Drosophila 
melanogaster 


multiple asters 


834 


29 


647 


X86691




Mi-2 protein


10110 


99 


648 


Uf£7934 


Homo sapiens 


44.9 kDa protein C18B11 
homolog


827 


96 


649 


AF236061


Oryctolagus
cuniculus


RING-finger binding protein


3330 


91 


650 


AL034553 


Homo sapiens 


dJ914P20.2 (KIAA0784 protein
similar to Mus musculus
activity-dependent
neuroprotective protein
(Adnp))


5708 


100 


653 


X14766 


Homo sapiens


GABA-A receptor alpha 1
subunit 


2368 


99 


654 
"655 


AC004614 


Homo sapiens 


^"ppuiiuiii proccins 
AB006086 (PID:g2529225) 


ifi^c 

J02o 


99 


656 


Y57908 
Z34975 


Homo sapiens 
Homo sapiens 


HTMPN-32. 
icucp 


cna 


99


659


AL>U5Q306 


Homo sapiens 


dJ475B7.2 (novel protein)


3733 
1942 


100 
99 




W76734 


Homo 
sapiens 


Human mDia Rho targeting 
protein. 


781 


34 


660 
661 


AP202724 
Z21966 


Homo sapiens 
Homo sapiens 


Sadl unc-84 domain protein 1 
mPOU homeobox protein 


2172 
1529 


100 

100


662 
663 


AJ242954 
AF182316 


Mus musculus
Homo sapiens 


dysferlin
myoferlin 


4752 
6232


99 


6te 

667 


X59303 


Arabidopsis 
thaliana
Homo sapiens 


hypothetical protein 
valyl-tRNA synthetase 


209 


30 


668 


Y1335S 


Homo sapiens 


protein PRO220 . 


J J7J 
J o 7a 


99 
100 


669 


AB010692 


Arabidopsis 
thaliana 


contains similarity to endo- 

beta-N-acetylglucosaminidase 

gene 


611 


b2 


671


X56123 


Mus musculus


talin


4474 


76 


672 


AB039371


Homo sapiens 


mitochondrial abc transporter 
3 


2902 


99 


673 


AF269223 


Homo sapiens 


TCP11 


806 


42 


674 


AF229633 


Mus musculus


groucho-related protein 4 


4053


99 


675 


1*14463 


Rattus
norvegicus


transducin


3619


92


676 


AC005757 


Homo sapiens 


R32611 1 


2779 


100 


677 


S61069 


Homo sapiens 


reverse transcriptase 
homolog=pol {retroviral 
element} 


252 


65 


678 


AF271388 


Homo sapiens 


CMP-N-acetylneuraminic acid
synthase 


2273 


100 


679 


X79066 


Homo sapiens 


ERF-1


1783 


100 


680 


AF118566 


Mus musculus 


hematopoietic zinc finger 
protein 


769 


50 


681 


Y5141* 


Homo 
sapiens 


Human wild type pKe83 
protein. 


2621 


99 


682 


AL133545


Homo sapiens 


DA386N14.1 (novel protein 
similar to a dual specificity 
phosphatase)


700 


68 


683 


Y86214 


Homo sapiens 


Nuclear transport protein 
clone hfb341 protein 
sequence. 


5888 


99 


684 


Y94952 


Homo sapiens 


Human secreted protein clone 
fhll6_ll protein sequence 
SEQ ID NO: 110. 


354 


9e 


685 


AL021878 


Homo sapiens 


dJ257I20.4 (transcription
factor 20 (AR1) (KIAA0292)
(isoform 2))


154 


67 


686 


AE000198 


Escherichia 
coli 


orf, hypothetical protein


628 


100 


687 


M58378 


Homo sapiens 


synapsin I 


3730 


9S 


688 


AF039697


Homo sapiens 


antigen NY-CO-31


508 


98 


689 


U09355 


Oryctolagus
cuniculus


protein phosphatase 2A1 B 
gamma subunit 


23S6 


99 


690 


AF155106 


Homo sapiens 


NY-REN-36 antigen 


265 


50 


691 


AC004774 


Homo sapiens 


Dlx-5 


1542 


100 


692 


X90530 


Homo sapiens 


ragB 


1926 


99 


693 


X90530 


Homo sapiens 


ragB


1405 


99 


694 


X90530 


Homo sapiens 


ragB 


1590 


85 


695 


G01563 


Homo sapiens 


Human secreted protein, SEQ 

ID NO: 5644. 


330 


100 


696 


AC011810 


Arabidopsis 
thaliana 


putative methionine 
aminopeptidase 


669 


52 


697 


AJ250425 


Rattus 
norvegicus 


Collybistin I 


2455 


98 


698 


AB037901 


Homo 
sapiens 


gene amplified in squamous 
cell carcinoma- 1 


5364 


99 


699 


Y99401 


Homo sapiens 


Human PRO1327 (UNQ687) amino
acid sequence SEQ ID NO: 218.


1386 


100 


701 


AF221712 


Homo 
sapiens 


Smad- and Olf- interacting 
zinc finger protein 


6705 


100 


702 


X83573 


Homo sapiens 


ARSE 


3184 


99 


703 


AJ243274 


Homo sapiens 


AP-2rep protein 


2078 




704 


Y71262 


Homo sapiens 


Human chondromodulin-like 
protein, Zchm1.


1697 


94 


705 


Y71262 


Homo sapiens 


Human chondromodulin-like
protein, Zchm1.


1736 


99


706 


Y41257 


Homo sapiens 


Amino acid sequence of long
human FAIM. 


1060 


100 


707 


AL022237 


Homo sapiens 


bK1191B2.3 (PUTATIVE novel
Acyl Transferase similar to
C. elegans C50D2.7) (isoform
1))


2030 


100 


708 


AJ006266 


Homo sapiens 


AND-1 protein


5942 


100 


709 


G01571 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5652. 


777 


99 


710 


Y08693 


Homo sapiens 


ranbp3 


2649 


9a 


711 


Y68770 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-2. 


754 


99


712


U93574 


Homo sapiens 


putative pl50 


799 


£9 


713


AC004531




box helicases 


2715 


99 


714 


D89016 


Homo sapiens 


Neuroblastoma 


538 


48 


715 


Y92175 


Homo sapiens 


Human cardiovascular system 
associated protein tyrosine 
phosphatase 2. 


734 


98 


716 


AL137013


Homo sapiens 


bA311P8.3 (probable uracil
phosphoribosyltransferase)


862 


100 


717 


AB035123 


Mus musculus


GD1 alpha/GT1a alpha/GQ1b
alpha synthase 


1696 


93 






nOmO >r4U^b4 

OCT-1984 09- 
APR-1983 
Human IgD. 
[Homo 
sapiens 


Human igfam-2 immunoglobulin. 


2345 


85 


719 


x<m79 


Homo sapiens 


integrin beta 1 subunit 
precursor 


4347 


99 


720 


AJ224819 


Homo sapiens 


tumor suppressor 


2149 


99 


721 


Y07595 


Homo sapiens 


transcription factor TFIIH 


2373 


100 


722 


W41S65 


Homo 

sapiens] 

>WaiS64 

W41564 08- 

OCT-1997 05- 

APR-1996 

Human 

calpain . 

[Homo 

sapiens 


Human calpain. 


1591 


99 


723 


AF161341 


Homo sapiens 


HSPC078 


1037 


98 


724 


AP187318 


Homo sapiens 


F-box protein Fbx2


1607 


100 


725 


AC006708 


Caenorhabditis
elegans


contains similarity to
Saccharomyces cerevisiae pre- 
mRNA splicing protein PRP31 
(GB:Z72876) 


1143 


46 


726 


AC006708 


Caenorhabditis
elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


988 


46 






Caenorhabditis
elegans


contains similarity to Pfam
family PF00400 (WD domain,

E=1.4e-20, N=3


950 


44


728 


AJ005897 




0M5 


831 


47 


729 


Y45377


Homo sapiens 


Human secreted protein
fragment encoded from gene
27.


908


97


730


G039"3i 




ID NO: 8012.


578 


100 


731 


AB012720 


Oncorhynchus
masou


GTP-binding protein 


3865 


76 


732 


W73404 


Homo sapiens 


Human secreted protein 

CULUUBu Dy vane MO . t) . 


862 


97 


733


G02650 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6731. 


644 


97 


734 


AC024813 


Caenorhabditis
elegans


Hypothetical protein 
Y54Fl0AL.a 


152 


24 


735 


AL035461 


Homo sapiens 


dJ967N21.6 (novel CDP-alcohol
phosphatidyltransferase
family member protein)


1562 


98 


736 


U00033 


Caenorhabditis
elegans


similar to S. cerevisiae YJU2 
protein 


605 


41 


737 


AF079098 


Homo 
sapiens 


arginine-tRNA-protein
transferase 1-1p; ATE1-1p


2733 


99


738


AJ131712 


Homo sapiens 


nucleolar RNA-helicase 






100 


739


AJ133115 


Homo sapiens 


rsc-22-like protein 


2054 


99 


740 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 




100 


741 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 


^4 


74 


742 


U97191 


Caenorhabditis
elegans


sub-fatnilv of oac rvmhoinn 

*«m**y A/w c Ails? 


Q£ ft 

you 


85 


743 


X76057 


Homo sapiens 


phosphomannose isomerase


2191 


100 


744 


G03209 


Homo sapiens


ID NO: 7290. 


496 


98 


745 


X97064 


Homo sapiens 





4034 


99 


746 


W93946 


Homo sapiens 


Human regulatory molecule 


994 


100 


747 


Y73388 


Homo sapiens 


HTRM clone 3376404 protein 
sequence . 


1565 


99 


748 


M19529 


Sus scrofa


coinstatin A 


1906 


98 


749 


JvJ ZUho 1 


Trichomonas
vaginalis


centrin, putative 


183 


28 


750 




Homo sapiens 


EOS39554 1 


2094 


100 


751 




Homo sapiens 


p47ING3 protein


2167 


100 


752 


AF252284 


Homo sapiens 


transcription specificity 
factor Spl 


4005 


100 






Homo sapiens 


phospholysine

phosphohistidine inorganic 
pyrophosphate phosphatase 


1375 


99


754


D79205 


Homo sapiens 


ribosomal protein L39 


160 


77 


755 


"OU U OtJU 


Homo sapiens 


CDEP 


142 


29 


758


L321*2 


Homo sapiens 


transcription factor 


574 


80 


759 


AT037204 


Homo sapiens 


RING zinc finger protein 


295 


54 


760


Y44«SQ 


Homo 
sapiens 


Human cell signalling
protein-13.


625 


100 


761


AF218b8b 


Homo sapiens 


Cide-b


1136 


100 


762 


U38934 


Gallus
gallus


histone H2A


625 


97 






Homo sapiens 


HSKM-B 


606 


32 


764


X13403 


Homo sapiens 


Oct-1 protein (AA 1-743)


3£2£ 


100 


765


D87446 


Homo sapiens 


Similar to a C. elegans
protein encoded in cosmid
C27F2 (1740419)


568 


38 


766


nr mi in a 


Caenorhabditis elegans


Y17G7B.14 


200 


27 


767 


Y82777 


Homo sapiens 


Human chordin related protein 
\ Li one dwboD *) . 


2^1 


99 


768 


X92475 


Homo sapiens 


1TBA1 


1429 


100 


769 






Human calcium binding protein 
3 (CaBP-3)


1426 


100 


770 


X51416 


Homo sapiens


521) 


«o4 1 


97 


771 


AJ006591 


Homo sapiens 


+*y s> tts jLiic— x Jiwn pioucxu 


1793 


100 


772 


A08695 


Homo sapiens




935 


100


773


Z12173 


Homo sapiens 


N-acetylglucosamine-6-
sulphatase




1 ft ft 


774 


Y91950 


Homo sapiens


Human cytoskeleton associated
protein 5 (CYSKP-5).


DO 3 


43 


776 


ALQ23799 


Homo sapiens


uu J£ i » JL I u Alii* iiiiLjcL^ 


855 


56 




AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger)


855


56 


778


G01880 


Homo sapiens


Human secreted protein, SEQ
ID NO: 5961.


B4 9 


98 


779 


AJ012590 


Homo sapiens


glucose 1-dehydrogenase


4155 


99 


780 


AL078582 


Homo sapiens 


dJ130E4.2 (KIAA0796) 


1321 


66 


781


Z75955 


Caenorhabditis elegans


similar to mitochondrial 
carrier protein 


384 


34 


782 


AL109965 


Homo 
sapiens 


dJ1121G12.2 (SCAN domain-
containing 1 protein)


900 


100 


783 


AF061262 


Mus 

musculus 


semaF cytoplasmic domain 
associated protein 2 


1316 


83 


784


G03873


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7954.


649



TABLE 2



PCT/US00/34263



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY






785




Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein. 


2074 


100 


786


V0fl9i8 


Homo sapiens 

- 


Human Rab protein, RABP-1,
protein sequence.


1048 


99 


787


A? / vcy 


Homo sapiens 


ribonuclease HI large subunit 


1548 


99


788 


AB035384 


Homo sapiens 


SRp25 nuclear protein


962


S4 


789 


A r U ^ 4 b J X 


Homo sapiens 


ANG2 


2644 


100 


790 


AJ006710 


Rattus 
norvegicus 


phosphatidylinositol 3-kinase


4508 


97 


792 


V00638


bacteriophage lambda


reading frame ea10


600 


100 


793 


AP049103 


Homo sapiens 


Huntingtin interacting 
protein 


819 


100 


795 


Z26317 


Homo sapiens 


desmoglein 2 


4810 


99 


796 


Y76884 


Homo sapiens 


Retinoblastoma binding
protein-7 sequence.


5080 


99 


797 


U15155 


Gallus
gallus


trypsinogen 


372 


37 


798 


U97189


Caenorhabditis elegans


strong similarity to the
PI3/PI4 family of kinases


227 


28 


799 


AF112201 


Homo sapiens


neuronal protein NP25 


1053 


100 


800 


AF234765 


Rattus
norvegicus


serine-arginine-rich splicing
regulatory protein SRRP86


958 


63 


801 


AF267852 


Homo sapiens 


placental protein 13-like
protein 


743 


99 


802 


AF208851 


Homo sapiens 


BM-009 


766 


100


803 


Z81097 


Caenorhabditis elegans


Similarity to Human
retinoblastoma-binding
protein RBAP46 yk662d12.5
comes from this gene


152 


27 


804 


G02113 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6194. 


496 


98 


805 


AL121673 


Homo sapiens 


bA305P22.1 (novel protein) 


1160 


100


806 


AC013483 


Arabidopsis
thaliana


putative GTPase activator 
protein 


264 


30 


807 


AC013483 


Arabidopsis 
thaliana 


putative GTPase activator 
protein 


264 


30


808 


AB013885


Homo sapiens 


beta-ureidopropionase 


1494 


100 


809 


AF078842


Homo sapiens 


HOTTL protein 


1581 


99 


810 


AF161421 


Homo sapiens 


HSPC303


2134 


si 


811 


AF261S89 


Homo sapiens 


DNA polymerase epsilon p17
subunit


734 


100 


812 


Z7402$ 


Caenorhabditis elegans


Similarity to C. elegans
alcohol dehydrogenase comes 
from this gene 


610 


71 



813 


273497 


Homo sapiens 


CU240C2.2 (Core histone 
H2A/H2B/H3/H4) 


324 


100 


814 


no roof 


Homo 
sapiens 


Human HTXFT19 polypeptide. 


1484 


99 


815


Alt) ZB« 


Homo
sapiens


zinc finger protein (217 AA) 
(1 is 2nd base in codon)


1109 


99 


R1 £ 

oio 




Mycobacterium
tuberculosis


pth 


300 


36 


818 


AB030483 


Mus musculus


B9 


197 


27 


819


AL117S55 


Homo sapiens 


hypothetical protein 


321 


94 


820 


AC005328 


Homo sapiens 


R26660_2, partial CDS


865 


97 


821 


G03951 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8032.


700 


99 


822


L34807


Musca
domestica


transposase


174 


20 


823 


G02928 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7009. 


558 


78 


824


299531 


Schizosaccharomyces
pombe


caffeine-induced death
protein 1


184


29






825 


AJ006692 






693 


68


826 


U23037


Oryctolagus 
cuniculus 


eIF-2Bepsilon 


340S 


90 


827 


G03412 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7493.


AC A ~~ 


100 


828


Y30327 




encoded from gene 17. 


113 


44 


829


Y32199 




numau LCLepLUL moiecuie IKfitU/ 
encoded by Incyte clone 
2022379. 


1012 


100 


830 


W78279


Homo sapiens 


Fragment of human secreted
protein encoded by gene 33.


1264 


99 


832 


AB011542 




rlEt\3 C 3 


2097 


100 


833 


G62639 


Homo sapiens 


Human secreted protein, SEQ 


223 


70 


834 


AF119664 




transcriptional regulator


1574 


100 


835 


AF119664 


Homo sapiens 


transcriptional regulator
protein HCNGP 


1144 


89 


836 


AF119664 




transcriptional regulator 
protein HCNGP 


1448 


94 


837 


X12517




protein (AA 1-159)


918 


100 


838 


U32865 


Drosophila
melanogaster


Irnotte protein 


164 


24 


839 


AF067730


Homo sapiens 


TLS-associated protein TASR-2


631 


56 


840 


U27831 


Homo sapiens 


striatum-enriched phosphatase


2840 


98


841 


£0 OJDO 


Homo sapiens 


CamKI-like protein kinase 


1796 


100


842 


G02309


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6390.


27B 


98 


843 


AE003615 


Drosophila
melanogaster


ade3 gene product 


113 


48 


844 


G01350 


Homo sapiens 


Human secreted protein, SEQ 


629 


100


845 


U27838 


Mus musculus


glycosyl-phosphatidyl-
inositol-anchored protein
homolog


3305 


96 


847 


Y87788


Homo sapiens 


Human RBP-26 protein. 


2026 


100 


848 


AP164794 






2398 


100 


849 


U41315 


Homo sapiens 


ZNF127-Xp 


2458 


93 


850 


AF192784 






2062 


97 


851 


Y58423


Homo sapiens 


Protein regulating gene 
expression PRGE-21.


154 B 


100 


852 


Z22968 


Homo sapiens 


M130 antigen 


6205 


100 


853 


Z22971 




M130 antigen extracellular
variant


6380 


100 


854


G03362 


Homo sapiens 


Human secreted protein, SEQ 

ID NO: 7443.


330 


96


855


G03362


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443.


203


100 


856 


AF285118 


Homo sapiens 


CGI-203 1 " — ~ 


452 


100 


857 


AC006069 


Arabidopsis 
thaliana 


Dutativp rlpavaao an/4 

factor 


1383 


""ce ' 

55 


858 


AL02154* 


Homo sapiens


Polypeptide VIa-liver
precursor (EC 1.9.3.1)


c a~i 
b 93 


100 


859 


L02956 


Xenopus 
laevis 


ribonucleoprotein


1664 


85 


860 


AF201947 


Homo sapiens


MEK binding partner 1 


616 


100 


861 


L31783 


Mus musculus


uridine kinase 


1266 


92 


862 


AF161472 


Homo sapiens


HSPC123 


602 


73 


863 


Z49Q68 


Caenorhabditis elegans


mitochondrial carrier protein 


370 


43 


864 


AF154108 


Homo sapiens 


tumor necrosis factor type 1
receptor associated protein


3559


99


865 


AE001530 


Helicobacter
pylori J99


putative 


230 


32 


866 


X57807


Homo sapiens 


immunoglobulin lambda light 
chain 


OS* y 


91 1 


867
868


AL031673
Y11652


Homo sapiens
Homo sapiens


dJ694314.1 (PUTATIVE novel
KRAB box protein with 18 C2H2
type Zinc finger domains)
phosphate cyclase


4066 


99 


869
870


AF192968


Homo sapiens


high-glucose-regulated
protein 8


3041


100 
99 




AB020648 


Homo sapiens 


KIAA0841 protein 


3237 


S9 


871 


AL031427


Homo sapiens 


CM167A19.1 (novel protein) 


1608 


100 


872


AF151534 


Homo sapiens 




1866 


100


873 


AL021331 


Homo sapiens 


dJ3S6N23.1 (putative C.
elegans UNC-93 (protein 1,
C46F11.1) LIKE protein)


1129 


100 


874
875


X14608




propionyl-CoA carboxylase


3579 


100 




AL117334 


Homo sapiens 


" dJ6B7Fli.i (novel protein 
(part of translation of cDNA 
<<p«ij4wubi, am:ALill0249) ) 


306 


100 


876


X79489 


Saccharomyces
cerevisiae


E-925 protein 


446 


35 


877 
878 


Y53D01 
AF231064 ' 


Homo sapiens 


Human secreted protein clone 
dn834 l protein sequence SEQ 
ID NO: 8. 

CKMP1.5 


811 
957 


100 
100 


879
880


X79417 
AF001317 


Sus scrofa
Saccharomyces
cerevisiae


40S ribosomal protein S12


687 
478 


100 
28 


881 


Y87275


Homo sapiens


Human signal peptide
containing protein HSPP-52
SEQ ID NO:52.


2547 


100 


882


M14036 


Homo sapiens 


C1-inhibitor


598 


77 


883 
884 


AB041261 
■AF0203I3 


Homo sapiens 
Mus musculus


calcium-independent
phospholipase A2 


2903 


100 


885 


Y1093* 


Homo sapiens 


yiuAinc-rica protein 9 8 
hypothetical protein 


999 
1104 


84 
99 


886 


AF073997


Mus musculus


myotubularin related protein


866 


36


887 
888


Y57893 
AL117635 


Homo sapiens 


Human transmembrane protein 
HTMPN-17. 


1099 


94 


889 


AF210317 


Homo sapiens 


hypothetical protein
facilitative glucose 
transporter family member 
GLUT 9 


929 
2046 


99 
99 


890 


Y36031 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


593 


100


891 


Y36031 


Homo sapiens 


protein sequence, SEQ ID NO. 
416. 


192 


57 


892 
893 


AF2*76'31 - 
AF090929 


Homo sapiens 
Homo sapiens


ubiquitous tropomodulin U-
Tmod

PR00477p 


1798 


166 


894 
895


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein 

BING4 (similar to S.
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 


653 
3196 


99 
100 


896 


AL031228
AF171102 


Homo sapiens 
Homo sapiens


dJ1033B10.2 (WD40 protein 
BING4 (similar to S.
cerevisiae YER082C, M. sexta
MNG10 and C. elegans F28D1.1)
retinal degeneration B beta


2825 


96 


897 


AE6035S1 


Drosophila 
melanogaster 


CGI 8176 gene product 


1302 
633 


95 
33 






WO 01/53312 



PCT/US00/34263 





898


AJ237946


Homo sapiens 


DEAD Box Protein 5 


2443


100


899 


Z97184 


Homo sapiens 


EKE2 


624 


100 


900 


Z97184 


Homo sapiens 


KKE2 






70 


901 


AJ245587 


Homo sapiens 


Kruppel-type zinc finger 


1942 


100 


902 


AF091034 


Homo sapiens 


GTP-binding protein RAB22A




AUU 


903 


R959S3 


Homo sapiens 


Eukaryotic cell growth 

"tnh hi t" "Lnci f arhoT 


414 


96 


904 


L04733 


Homo sapiens 


kinesin light chain 


1936 


72 


905 


AE003540 


Drosophila
melanogaster


HQRA nana nvnr4nn1* 
LUlUJDt y If 11CS UJLUUUUL 


446


33


906 


1455542 


Homo sapiens


guanylate binding protein
isoform I 


2993 


98 


907




Homo sapiens 


guanylate binding protein 
isoform I 


2901 


96 


908 


> v o t u a j 


Homo sapiens 


Human membrane fusion protein 
WDProl . 


1889


100 


909 


AF168676 


Homo 
sapiens 


TNF intracellular domain- 
interacting protein 


647 


100 


910 


AB029150 


Homo sapiens 


KRAB zinc finger protein 
HFB101L 


2196 


100


911 


G02871 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6952. 


521 


100 


912 


G03162 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7243. 


387 


87 


913 


AJ243721 


Homo 
sapiens) 
>Y92508 
Y92508 13- 
APR-2000 06- 
OCT-1998 
Human OXRE- 
5 . [Homo 
sapiens 


dTDP-4-keto-6-deoxy-D-glucose
4-reductase


1710 


100 




U24189 
. . . 


Caenorhabdit 
is elegans 


hypothetical protein 1207- If 
Method: conceptual 
translation supplied by
authors 


244 


41 


915 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


843 


99 


916 


AE000984 


Archaeoglobus
fulgidus


dinitrogenase reductase 
activating glycohydrolase 
(draG) 


171 


26 


918


M23159 


Cricetus
cricetus


DHFR-coamplified protein


163 


30 


919 




Caenorhabditis elegans


putative


1232 


41 


920 


AF102177 




tumor anexgen oX»r / — op 


1260 


97 


921 


AL096712 


Homo sapiens


novel human gene mapping to 

Ac h i va t air \ 


1017 


78 


922 


AL161495 


Arabidopsis
thaliana


putative WD-repeat protein


866 


42 


923 


AL161495 


Arabidopsis
thaliana


putative WD-repeat protein


442 


3£ 


924 


U97001 


Caenorhabditis elegans


similar to 


605 


51 


925 


X71978 


Mus musculus 


Fi£ 


1503 


95 


926


K92288 


Drosophila 
melanogaster 


beta-spectrin 


290 


51 


927 


Y27575 


Homo sapiens 


Human secreted protein 
encoded by gene No. 9. 


1392 


100


928 


Y22499 


Homo sapiens 


Human secreted protein 
sequence clone mh703_l. 


2249 


100


930


AJ224326 


Homo sapiens 


ribulose-5-phosphate- 
epimerase 


912 


100 


931 


U28991 


Caenorhabditis elegans


coded for by C. elegans cDNA
cm21c7


660


S5


932 
933 


AL080065 
G01384 


Homo sapiens 
Homo sapiens 


hypothetical protein 

Human secreted protein, SEQ 

ID NO: 5965. 


210
767 


A3 

i — i --- — ■ 

70 


934 


AJ276485 


Homo sapiens 


integral membrane transporter 
protein 


1200 


100


935 


AL035681 


Homo sapiens 


dJ756G23.3 (novel protein 
similar to Drosophila
transcriptional repressor) 


1142 


Art 
OU 


936


AB026808


Mus musculus


synaptotagmin XI 


2142 


95 


937 


AB015345 


Homo sapiens 


HRIHFB2216 


260i 


99


938 


X65724 


Homo sapiens 


ORF2


498 


100 


939 


W89024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156. 


1487 


100 


940


G04047 


Homo sapiens 


Human secreted protein, SEQ 

ID NO: 8198.


117 


100 


941 


AF094S83 


Homo sapiens 


putative HIV-1 infection
related protein 


452 


100 


942 
943 


AC024200 
AF129756 


Caenorhabditis elegans

Homo sapiens 


contains similarity to 
several zinc finger proteins 

but not to tY\f* -l nr- f ■( Yvror 

uvjK. \-\-f uiic £tXii\~ Linger 
domains 

G5C 


350 


69 


944 


K23765 


Rattus
norvegicus 




273 
133 


100 
96 


945 
946 


AC009917 
AF223468 


Arabidopsis 
thaliana 
Homo sapiens


contains similarity to


583 


47 


947 
948 


AF055473 
X7S75£ 


Homo sapiens 
Homo sapiens 


GAGE-8

protein kinase C mu 


551 
273 
2019 


44 
51 
68 


949 
950 


AF1439S6 
Y36729 


Mus musculus

Homo 

sapiens 


coronin-2

Human PG1 protein sequence. 


2300 
1861 


93 
99 


951 


W49041 


Homo sapiens 


Human low density lipoprotein 
binding protein LBP-2. 


232 


67 


952 


AB016881 


Arabidopsis 
thaliana 


gene_id:MXC17.7


203 


46 


953 


^01785 


Homo sapiens 


Human ubiquitin-conjugating
enzyme >Y25341 Y25341 01-JUL-
1999 12 -AUG- 199 8 Uuimn urv- o 
protein. 


365 


100


954
955 


AF145615 
U09410 


Drosophila 
melanogaster 
Homo sapiens 


BCDNA.GH03377 

zinc finger protein ZNF131 


823 
2483 


46
99


956 


U09410 
AF195623 


Homo sapiens 
Homo sapiens 


zinc finger protein ZNF131 
cholinephosphotransferase 1
alpha 


1853 
2126 


99 
39 


958 


X94917


Drosophila 
melanogaster 


head-elevated expression in 
0.9 kb


155 


32 


959 
960 


U54807 
AF058807 


Rattus 
norvegicus 
Bos taurus 


GTP-binding protein 
GTP-binding protein rah 


1167 


97 


961 
962 


G03244 
AF078850


Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7325. 


606 
471 


97 
100


963


APO 01754 


Homo sapiens 


transient receptor potential- 
related channel 1, a novel 
putative Ca2+ channel protein 


583 

317


40

30 


964 


AL03S419 


Homo sapiens


dJ1100H13.1 (putative novel
protein) 


1129 


100 





X61381 


Rattus 
rattus 


interferon-induced protein


202 


46 




D38169 


Homo 
sapiens 


inositol 1,4,5-trisphosphate 
3-kinase isoenzyme 


3278 


100 


967 


AL031432 


Homo 
sapiens 


dJ4«N24.2.X (PUTATIVE novel " 
protein) (isoform 1) 


893


100


968 


U79275 


Homo sapiens 




oil 


100 


969 


AJ0L13 06 


Homo 
sapiens 


guanine nucleotide exchange 
factor flnna 1 Qnfnnn \ 


2752 


99 


970 


AF281134 


Homo sapiens 




-- 

1186 


100 


971 


U53336 


Caenorhabdit 
is elegans 


weak similarity over a short

*cyigu iiiyvJoin lied Vy CllalD 


536 


23 


972 


AC018749 


Leishmania
major


LB 04 0 1? 


589 


53 


973 


AF188504 


Mus musculus 


LNV 


544 


85 


974 


U25801 


Homo sapiens 


T^y1 Ki nHi nn r\ vs-\ hoi n 


852 


98 


975 


AP049523 


Homo sapiens 
1 


huntingtin-interacting


1390 


97 


976 


AF161530 






1040 


100 


977 


G04020


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8101.


626 


100 


978 


AF164797


Homo sapiens


ribosomal protein L17 isolog 


908 


100 


979 


U94991 


Xenopus 
laevis 


transcription factor XLM01 


795 


97 


980 


S73775 


Homo sapiens 


calmitine; calsequestrins 


2029 


100


981 


V Q n Q a a 

i y4 Bob 


Homo 
sapiens 


Human protein clone HP01462.


2501 


100 


982 


AJ243191 


Homo sapiens 


heat shock protein


827 


96 


983


X65020 


Bos taurus 


PSST subunit of the NADH: 
ubiquinone oxidoreductase 
complex 


964 


85 


984 


AJ249207 


Rhodococcus
sp. AD45


putative racemase 


351 


43 


985 


Z30093 


Homo sapiens 


basic transcription factor 2, 
35 kD subunit 


1576 


99 


986 


AJaU3U 83d 


Homo sapiens 


contains two glutamine rich 
domains, three zinc-finger
domains, and matrin 3 
homologous domain 3 (MH3) 


4697 


99 


987 


AF227258 


Bos taurus 


RPGR-interacting protein-1


1262 


38 


988 




Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE) 


4048 


99 


989 


™jUii! i Jo 


Homo sapiens


dJ1042K10.2 (supported by 
GENSCAN, FGENES and GENEWISE) 


2321


99 


990 




Homo sapiens


HSPC308


448 


52 


991 


AF161426 


Homo sapiens 


HSPC308 


448


92


992


AF161426


Homo sapiens 


HSPC308


453 


92 


993 


AL023859 


Schizosaccharomyces
pombe


tRNA-splicing endonuclease
subunit 


172 


42 


994 


AL049631 


Homo sapiens


aui>i3M>.i inovel Homeobox 
domain protein) 


241 


47 


995 


AC005253 


Homo sapiens 


R26445_1


902 


100 


996 


AF265206 


Homo sapiens 


MOG1 isoform A


974 


100 


997 


AJ248285 


Pyrococcus
abyssi


sarcosine oxidase, subunit
beta (soxB) 


195 


28 


998 


AE003641 


Drosophila
melanogaster


su:uauu?4i,i gene product 


218 


58 


999 


W69343 


Homo 
sapiens 


Secreted protein of clone 


1340 


98 


1000 


AY007135 




similar to bovine ADP/ATP 
translocase T1 mRNA with
GenBank Accession Number
M24102.1


1543 


100 


1001 


Y73381 


Homo sapiens 


HTRM clone 1877278 protein 
sequence.


1668 


100 


1002 


AF208844 


Homo sapiens 


BM-002 


428 


100


1003


AE004944 


Pseudomonas
aeruginosa


hypothetical protein 


134 


35


1004 


AL031431 


Homo sapiens 


dJ462023.2 (novel protein) 


2058 


100 


1005 


S45367 


Canis
familiaris


centractin 


1949 


100 





1006 


S45367


Canis
familiaris


centractin 


1315 


JO 


1007 


AB022158 


Mus
musculus


chaperonin containing TCP-1 
epsilon subunit 


2649 




1008 


Y76332 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 38. 


1282 


97 


1009 


AB011414 


Homo sapiens 


Kruppel-type zinc finger
protein 


1671 


58 


1010


Z58218 


Caenorhabdit 
is elegans 


K01H12.1 


269 


67 


1011 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1012 


Z14000 


Homo sapiens 


RING1 


2017 


100 


1013 


G02841 




Human secreted protein, SEQ
ID NO: 6922.


JJ2 


93 


1014 


AF14S659 " 


Drosophila 
melanogaster 




1244 


52 


1015 


Y02860 




Fragment of human secreted 
tr** chwddsq j gene ob . 


664 


67 


1016 


Y02591 


Homo sapiens 


A human progesterone receptor
complex p23-like protein.


772 


97 


1017 


Y99448 


Homo sapiens 


Human PRO1759 (UNQ832) amino
acid sequence SEQ ID NO: 374.


2323 


100 


1018 


X67250 


Rattus
norvegicus 




1710 


97 


1019 


AF183417 


Homo 
sapiens 


microtubule-associated
proteins 1A/1B light chain 3 


6*31 


100 


1020 


AF154795 




sex-regulated protein janus-a


674 


100 


1021 


AF130625 




qaga-i 


638 


96 


1022 


AL133363 


Arabidopsis
thaliana


putative protein 


155 


37 


1023 


AB034912 


Homo sapiens 


WD-repeat like sequence


2493 


100 


1024 


AY007091




similar to Homo sapiens
mammalian inositol
hexakisphosphate kinase 2
(IP6K2) mRNA with Ge


2243 


100 


1025 


X69910 




P63 protein




99 


1026


U8073^ 


Homo sapiens 


CAGP9 


1657 


100 


1027 


AB029333 


Halocynthia
roretzi




1048 


54 


1028 


AB032931 


Homo sapiens 


ubiquitin-conjugating enzyme
isolog 


1045 


106 


1029 


G01797 




Human secreted protein, SEQ
ID NO: 5878.




98 


1030 


G01797 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 5878.




98 


1031 


AF193795


Homo sapiens 


vacuolar sorting protein 
VPS29/PEP11 




100 


1032 


AJ222968 


Mus musculus


L-periaxin 


120 


30 


1033 


Z81317 


Schizosaccharomyces
pombe


DNA2-NAM7 helicase family 
protein 


685 


31 


1034 


Y41519 


Homo sapiens 


Fragment of human secreted
protein encoded by gene 75. 


1*71 


~5o 


1035 


AJ276004 


Mus musculus


Paxneb protein 


1709 


77 


1036 


AF025459 


Caenorhabditis elegans


H14A12.3 gene product 


190 


30 


1037 


U37251 


Homo sapiens 


Description: KRAB zinc finger
protein; this is a splicing
supplied by author


196 


43 


1038 


W74580 


Homo 
sapiens 


Human membrane protein 
BA03O6. 


X921 


97 


1039 


U88173


Caenorhabditis elegans


weak similarity to
Arabidopsis thaliana
ubiquitin-like protein 8


331 


80 





1040 


AF23Q204 




DOK1 


1637 




1041 


Y96730 


Homo 
sapiens 


PRO539, a Costal-2 homologue.


162 


22 


1042 


AF140683 


Mus musculus 


F-box protein FWD2 


2397 


■"or 


1043 


AF151023 


Homo sapiens 


HSPC1B9 


1164 


100 


1044 


AF181631 


Drosophila
melanogaster


BcDNA.GH04929


204 


37 


1045 


Y77985 


Homo sapiens 


Human collefh^n ami r»r» an' ri 
sequence . 


1940


100 


1046 


AJ243972 


Homo sapiens 




1317 


100 


1047 


A3035863 " 


Homo sapiens 


synthetase beta subunit 
precursor 


2324 


99 


1048 


AL034550 


Homo sapiens 


dJ1184F4.2 (novel protein
similar to nucleolar protein
4 (NOL4) (NOLP))


981 


92 


1049 


AF163 82S' " 


*4rtlftrt cam pna 


pAt; o xyuiptiocyte protein j 


634 


100 


1050 


AF201949 


Homo sapiens 


60S ribosomal protein L30 
isolog 


868 


100 


1051 


AF190624 


Mus musculus 


mdgl-1 


236 


85 


1052 


AE003529 


Drosophila 
melanogaster


CG6151 gene product 


160 


44 


1053 


G01191


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5272.


646 


98 


1054


AL162756


Neisseria 
meningitidis 


Glu-tRNA(Gln)
amidotransferase subunit A


682 


44 


1055 




Rattus
norvegicus 


tRNA selenocysteine
associated protein 


1525 


99 


1056 




Chlamydomonas
reinhardtii


Mr19,000 outer arm dynein
light chain 


244 


34 


1057 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1 


663 


53


1058 


AF230929 


Homo
sapiens


keratinocyte annexin-like 
protein pemphaxin 


1710 


99 


1059 


AJ270952 


Homo sapiens


putative membrane protein 


1363 


100


1060


AF224263 


Heterodontus 

Am »- 91 t V* M mm* ml. 


HOXDB 


742 


83 


1061 


X63417 


Homo sapiens 


IRLB 


1037 


100 


1062 


AL07934S 


Streptomyces
coelicolor
A3(2)


hypothetical protein 


143 


27 


1063 


Y71112 




Human hydrolase protein-10
(HYDRL-10).


2547 


100 


1064 


AF263614 






3493 


99


1065 


Y13356 


Homo sapiens 


Amino acid sequence of 
orotein PROMPT 


13*3 


100 


1066 


AC006153 


Homo sapiens 


similar to Aquifex aeolicus
GTP-binding protein; similar
to AE000771 (PID:g2984292)


662 


98 


1067 


Y18930 


Sulfolobus
solfataricus


hypothetical protein 


162 


29 


1068 


R65969 


Homo 

sapiens T98G 


Glioblastoma-derived 
polypeptide.


887


100 


1069 


Y07964 - 




Human secreted protein 
fragment 


ODJ 




1070 


AF177476


Rattus 
norvegicus 


CDK5 activator-binding 
protein 


1995 


86 


1071 


AF245505 


Homo sapiens 


adlican 


3109 


99 


1072


U92794 


Mus musculus 


alpha glucosidase II, beta
subunit 


147 


36 


1073 


G03889 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7970.


698 


98 


1074 


U15779 


Homo sapiens 


p70 


3S0 


28 


1075 


Y13392 


Homo sapiens 


Amino acid sequence of 


1271 


91














1076 


AF161457 


Homo sapiens 


HSPC339 


571 


100 


1077 


779509 




protein CRBAP-5. 


2151 


98 


1078 


AF223466 


Homo sapiens 


HT015 protein 


831 


"TF 

bo 


1079


AL132965 


Arabidopsis
thaliana


putative WD-40 repeat-protein


286


29 


1080 


AB024937 


Homo sapiens 


LUNX 


1284 


100


1081 


Y14768 


Homo sapiens 


v-ATPase G-subunit like 
protein 


579 


100 


1082


AF016416 


Caenorhabditis elegans


F29A7.4 gene product 


141 


31 1 


1083 


1,13291 


Homo sapiens 


ADP-ribosylarginine hydrolase


802 


45 


1084 


AB041541 


Mus musculus 


unnamed protein product 


151 


44 


1085 


G01922 


Homo sapiens


Human secreted protein, SEQ 


202 


97 


1086 


AB03O814 


~rj , 

Homo sapiens 


H-Kbvio7 protein homolog 


933 


100 


1087 


AF151G3S 


Homo sapiens


phosphatidylcholine transfer 
protein 


1142 


100 


1088 


Y84432 


Homo sapiens 


Amino acid sequence of a
human RNA-associated
protein.


2783 


100 


1089 


Y94867 


Homo 
sapiens 


Human protein clone HP10563.


S13 


100 


1090 


r.AUZJ JOC. 


Homo sapiens 


unnamed protein product 


130 


49 


1091 


AB041586 


Mus musculus 


unnamed protein product 


1103 


81 


1092 


Y71277 


Homo sapiens 


Human Zlipo3 protein. 


606 


100 


1093


1734 973 


Mus musculus 


protein tyrosine phosphatase- 
like 


1131 


95 


1094 


Y66677 


Homo 
sapiens 


Membrane-bound protein 
PRO828.


£22 


£6 


1095 


Y87276


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 


1029 


99 


1096 


Y87276


Homo sapiens


Human signal peptide
containing protein HSPP-53
SEQ ID NO: 53.


863 


98 


1097 


AF161455 


Homo sapiens 




742 


98 


1098 


U80029


Caenorhabditis elegans


similar to thioredoxin


242 


39 


1099 


AJ005866


Homo sapiens


Sqv-7-like protein


1321 


99 


1100 


AJ005866


Homo sapiens 


Sqv-7-like protein


1118 


99 


1101 


AJ005866 


Homo sapiens 


Sqv-7-like protein


891 


99 


1102 


AJ005866




Sqv-7-like protein 


1016 


99 


1103 


AL110244


Homo sapiens


hypothetical protein


299 


31 


1104 


AF242194 


Drosophila
melanogaster


brakeless-B


147 


52 


1105 


AL031010 ' 


Homo sapiens 


dJ422F24.1 (PUTATIVE novel 

BTfthfkl n ni lav r\ f~* &*1 ^.t— 

protein similar to C. elegans
C02C2.5)


968 


100 


1106 


U28016 




parathion hydrolase
(phosphodiesterase) -related 
protein 




87 


1107 


AJ27B150 


Homo sapiens 


putative lipid kinase




99 


1108 


G03733 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7814.


495 


98 


1109 


AF217287 


Drosophila 
melanogaster 


G protein RhoBTB 


834 


54 


1110 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


941 


48 


1111 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


1331 


51 


1112


AF176704


Homo sapiens 


F-box protein FBX9 


2027 


99 


1113 


AF182076 


Homo 
sapiens 


glioma tumor suppressor 
candidate region protein 2 


2418 


100 


1114 


G04039 




Human secreted protein, SEQ 


475 


96














1115 


AF229439


Mus musculus 


zinc finger protein 289 


1697 


91 


1116


L40357 


Homo sapiens 


thyroid receptor interactor 


509 


100 


1117 


L40357 


Homo sapiens 


thyroid receptor interactor


404 


85 


1118 


A12155' 


Homo sapiens


human X5L cDNA. 


1673 


100 


1119




Arabidopsis
thaliana


isomerase like protein 


607 


53 






Homo sapiens 


dJ272L16.1 (Rat

Ca2+/ Calmodulin dependent 

Protein Kinase LIKE protein) 


2341 


98 


1121 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25. 


321 


36 


1122 


214 122 


Xenopus 
laevis 


XLCL2 


455


77 


1123 


AP225418 


Homo sapiens 


lipase 


1*31 


97 


1124 


Y06518 


Homo sapiens 


Zen GTPase interacting 
protein ZIP. 


3227 


100 


1125 


AL035590 


Homo sapiens 


dJ202I21.1 (novel protein) 


952 


100 


1126 


AJ000217


Homo sapiens 


CLIC2 


1286 


99 


1127 


AB030505 


Mus musculus 


UBE-1C2 


1069 


79 


1128 


Y7337S 


Homo sapiens 


HTRM clone 1427838 protein 
sequence. 


874 


100 


1129 


Y78941 


Homo sapiens 


Cyclophilin-type peptidyl
prolyl cis/trans isomerase 
amino acid sequence. 


877 


100 


1130 


AL023553


Homo sapiens 


dJ347H13.4 (novel protein) 


557 


100 


1131 


Y91945 


Homo sapiens 


Human chaperone protein 6 
(HCHP-6).


1408 


100 


1132 


Z68197 


Schizosaccharomyces
pombe


putative nuclear pore protein 


596 


39 


1133 


Z68197 


Schizosaccharomyces
pombe


putative nuclear pore protein 


389 


35 


1134 


AF180681 


Homo sapiens 


guanine nucleotide exchange 
factor 


3597 


100


1135 


AF079765 


Mus musculus 


enhancer of polycomb 


264 


41 


1136 


M62419 


Mus musculus 


clathrin-associated protein 


2189


99


1137


AJ006219


Drosophila 
melanogaster 


clathrin-associated protein


1254 


78 


1138 


Y76218 


Homo sapiens 


Human secreted protein 
encoded by gene 95. 


440 


98 


1139


W98104


Homo 
sapiens 


A Rab protein designated 
HRABS-2. 


1065


99 


1140 


Y13401 


Homo sapiens 


Amino acid sequence of 
protein PRO339.


3979 


98 


1141 


rlODUZo 


Chimeric - 
Homo sapiens 


Green fluorescent protein- 
Zap70 fusion product. 


3309 


100 


1142 


Y13402


Homo sapiens 


Amino acid sequence of 
protein PRO310.


1694 


99 


1143 


G03675 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7956. 


660 


99 


1144 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


750 


98 


1145 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


1096 


100 


1146 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34))


1233 


100 


1147 


AL022157


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34))


1233 


100 


1148 


G02548 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6629. 


370


98 


1149 


Y73338 


Homo sapiens 


HTRM clone 2019742 protein 
sequence.


1492 


100 


1150 


W74841 


Homo sapiens 


Human secreted protein 
encoded by gene 113 clone 


228 


55 








HEAAR60.






1151 


AF044201 


Rattus 
norvegicus 


neural membrane protein 35; 
NMP35 


1570 


92 


1152 


AF156774 


Homo 
sapiens 


lysophosphatidic acid
acyltransferase-gamma1


1855 


99 


1153 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069) ) 


872 


64 


1154 


AF131852 


Homo sapiens 


Unknown 


473 


100 


1155 


Y41705 


Homo 
sapiens 


Human PRO352 protein
sequence.


1381 


97 


1156 


G04036 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8117.


607 


99 


1157 


AK112444 


Lupinus
luteus 


L-asparaginase


287 


43 


1158 


AF151848


Homo sapiens 


CGI-90 protein


232 


32 


1159 


AJ272267 


Homo sapiens 


choline dehydrogenase 


2449 


100 


1160 


AB001773 


Ciona
savignyi 


PEM-6 


196 


33 


1161 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107.


746 


83 


1162 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107.


746 


83 


1163 


AP1135^4 


Homo sapiens 


HP1-BP74 protein 


2723 


96 


1164 


AF232226 


Danio rerio 


Dedd1


191 


41 


1165 


AL118501


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


1051 


71 


1166 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


945 


75


1167 


AF187733 


Homo sapiens 


syntaphilin


831 


42 


1168 


AB019435 


Homo sapiens 


phospholipase


951 




1169 


AF064604 


Homo sapiens 


KE03 protein 


324 


33 


1170 


Y01164 


Homo sapiens 


Polypeptide fragment encoded 
by gene 6. 


1191 


100 


1171 


L03188


Saccharomyces
cerevisiae


putative 


180 


22 


1172 


AF113751 


Mus musculus


nuclear pore membrane 
glycoprotein POM210 


3941 


81 


1173 


AJ245417 


Homo sapiens 


G5b protein 


794 


100 


1174 


AL022238 


Homo sapiens 


dJ1042K10.3 (novel protein) 


1285 


100 


1175 


U41278 


Caenorhabditis
elegans


F33G12.3 gene product 


332


28


1176 


M35617 


Homo sapiens 


T-cell receptor V-alpha-J-
alpha region


284 


83 


1177 


AC012680 


Arabidopsis
thaliana


putative protein phosphatase 
2C; 55455-56414 


209 


37 


1178 


G01345 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5426. 


692


99 


1179 


AL096767 


Homo sapiens 


dJ579N15.3 (novel protein
similar to worm, Arabidopsis 
and pine proteins) 


1342 


100 


1180 


AF039715


Caenorhabditis
elegans


similar to ATP synthase B 
chain 


496 


55 


1181 


Y11710 


Homo sapiens 


collagen type XIV 


1048 


97 


1182 


X82240


Homo
sapiens]
>R94974
R94974 09-
MAY-1996 27-
OCT-1994
Human TCL-1
polypeptide.


T cell leukemia/lymphoma 1


617 


100 






[Homo 
sapiens 








1183 


U42841


Caenorhabditis
elegans


short region of weak 
similarity to collagen 


161 


33 






Homo sapiens 


dicarboxylate carrier protein 


1470 


99 


1186


L27645 


Danio rerio 


growth-associated protein 


130 


36 


1187 


Y02738


Homo sapiens 


Human secreted protein 
encoded by gene 89 clone 
HLHFP03.


636 


100 


1188


AF217544 


Xenopus 
laevis


ornithine decarboxylase-2


1459 


60 


1189 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a
protein which promotes
neurite outgrowth)


182


33


1190 


X89602 


Homo sapiens 


rTSbeta 


197 


100 


1191 


U32828 


Haemophilus 

influenzae 

Rd 


ribosomal protein S6 
modification protein (rimK) 


268 


31 


1192 


AF154831 


Rattus 
norvegicus 


PV-1


1403 


60 


1193 


Y50926 


Homo sapiens 


Human fetal brain cDNA clone 
vcl6_l derived protein. 


918 


100 


1194 


AF026530


Rattus 
norvegicus 


stathmin-like-protein splice
variant RB3


1093 


97 


1195 


U35244 


Rattus 
norvegicus 


vacuolar protein sorting 
homolog r-vps33a 


2981 


96 


1196 


Y70470 


Homo sapiens 


Human p53 target molecule, 
PRG3 protein. 


1680 


100 


1197 


AF157318


Homo sapiens 


AD-017 protein


912 


47 


1198 


AF125443 


Caenorhabditis
elegans


contains similarity to S.
pombe phosphatidyl synthase
(GB:228295)


460 


39 


1199 


AF201934 


Homo sapiens 


DC12


1649 


88


1200 


AL031775 


Homo sapiens 


dJ30M3.3 (novel protein 
similar to C. elegans 
Y63D3A.4) 


1902 


100 


1201 


M21103 


Ovis aries 


BIIIB4 high-sulfur keratin 


484 


82 


1202 


Z85986


Homo sapiens 


dJ108K11.3 (similar to yeast
suppressor protein SRP40) 


1143 


75 


1203 


U18762


Rattus 
norvegicus 


retinol dehydrogenase type I 


890 


52 


1204 


U35730 


Mus musculus 


jerky 


2235 


it 


1205 


AB002327 


Homo sapiens 


KIAA0329 


151 


24 


1206 


AB019233 


Arabidopsis 
thaliana


ubiquinone/menaquinone
biosynthesis
methyltransferase-like


762 


56 


1207 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 


742 


100 


1208 


Mis U / poU 


Homo sapiens 


orphan G-protein coupled 
receptor 


2326 


100 


1209


/ D J 0 


Homo sapiens 


dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
G))) 


181 


44 


1210 


U-i JL 


Mus musculus 


Ac39/physophilin


1280 


68 


1211 


Y27700 


Homo sapiens 


Human secreted protein
encoded by gene No. 12. 


1 


100 


1212 


AF117814 


Mus musculus 


odd-skipped related 1 protein


945 


6<J 


1213 


AF277233 


Naegleria 
fowleri 


calcineurin B 


222 


39 


1214 


D14849 


Mus musculus 


meiosis-specific nuclear
structural protein 1 


19S0 


77 


1215 


G03022


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7103.


590 


100 


1216 


Z72510 


Caenorhabdit 


similarity to yeast UTR3 


634 


49 






is elegans 


protein (Swiss Prot accession 
yk677hll.S comes from this 
gene 






1217 


Z49703


Saccharomyce 


unknown 


134 


22 


1218 


ACO 3.3430 


Arabidopsis
thaliana


F3F9.18 


199 


29 


1219 


L10910 


Homo sapiens 


splicing factor 


1026 


71 




it / U /3U 


Caenorhabditis
elegans


similar to vanadate 
resistance protein 
transmembranous comes from 
this gene 


965 


SB 


1221


ar.1 Q 1 C 
niilDj OXD 


Arabidopsis
thaliana


putative protein 


653 


61 


1222 


■rU* JL U U 


Homo sapiens


zinc finger protein NY-REN-21
antigen


2261 


100 


1223 


U U3U / x 


Bos taurus


GTP-binding regulatory 
protein gamma-6 subunit


356 


100 


1224 




Homo sapiens 


HTRM clone 2765991 protein 
sequence.


1169 


99 


1225 




Homo sapiens 


hypothetical protein 


714 


100 


1226


X64002 


Homo sapiens 


RAP74 


2661 


99 


1227 


AU4U Q 3 


Homo sapiens 


catalase 


2846


100 


1228 


AJO0S620 


Mus musculus 


skeletal muscle-specific gene


1416 


90 


1229


AF045564 


Rattus 
norvegicus 


development-related protein 


1715 


93 


1230 


J197571 


Mus musculus 


HCMV-interacting protein


479 


96 


1231 


L08239


Homo sapiens 


located at OATL1


2274 


100 


1232 


AF121863 


Homo sapiens 


sorting nexin 14 


1964 


100 


1233 


AF121863 


Homo sapiens 


sorting nexin 14 


1203 


84 


1234 


AC024805 


Caenorhabditis
elegans


contains similarity to 
TR:O04595 


744 


31 


1235 


AC006634 


Caenorhabditis
elegans


contains similarity to 
Saccharomyces cerevisiae 
probable membrane protein 
YLR418C (GB:U20162) 


357 


33 


1236


718101 


Mus musculus 


macrophage actin-associated-
tyrosine-phosphorylated
protein 


1559 


87 


1237 


AB042646 


Homo sapiens 


TGIF2 


1224 


100 






Homo sapiens 


IMPACT 


1694 


100 


1239 


AB026264 


Homo sapiens 


IMPACT 


1123 


100 


1240 


WUU42;* 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 4510.


324 


100 


1241 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1363 


53 


1242 


AL035602 


Arabidopsis
thaliana


putative protein 


499 


28 


1243 


*C764 83 


Gallus 
gallus


Yes-associated protein 
(65 kDa)


574 


48


1244 


AF2201A6 


Homo sapiens


uncharacterized hypothalamus 
protein HT012 


503 


100


1245 


Ali02145'4 
nuv« j. j j 


Homo sapiens


dJ821D11.3 (putative protein)


856 


100 


1246 


AJ27S003 


Homo sapiens 


GAR1 protein


1216 


100 


1247 


Y57910 


Homo sapiens


Human transmembrane protein 
HTMPN-34. 


1369 


98 


1248 


AC004874 


Homo sapiens 


similar to N-
acetylgalactosaminyltransferase;
similar to Q07537
(PID:g1171989)


73 / 


100 


1249 


AF199597 


Homo 
sapiens 


A-type potassium channel
modulatory protein 1 


1139 


100 


1250


Y13148


Rattus 
norvegicus 


PAG608


1350 


88 


1251 


M24852 


Rattus 
norvegicus 


neuron-specific protein PEP-19


124 


46 


1252 


AF146738


Rattus
norvegicus 


testis specific protein


711


d3 


1253 


G02725 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6BC6. 


419 


97 


1254 


W44375 


Homo sapiens 


Human ubiquitin-conjugating
enzyme polypeptide.


1045 


99 


1255 


AC006538 


Homo sapiens 


BC41195 1 


831 


78 


1256


AB004316


Bos taurus 


transformylase


1556


""flfl 


1257 


Z35094 


Homo sapiens 


SURF-2


1354 


97 


1258 


Y13362 


Homo sapiens 


Amino acid sequence of
protein PRO214.




100 


1259 


AC006014 


Homo sapiens 


Similar to RFP transforming
protein; similar to P14373
(PID:g132517)


1299 


100 


1260 


AC005099 


Homo sapiens 


match to AI222572 
(NID:g3804775)


469


100 


1261 


V00S07 


Homo sapiens 


ocyuciice \JX. unfK \ i. IS 

1st base in codon) (561 is
3rd base in codon)


984 


100 


1262 


X15443 


Rattus sp.


(AA 1-568) 


697 


32 


1263 


AF173873 


Mus musculus


neuronal P"S55 


977 


94


1264 


AP178983 


Homo sapiens 




433 


97 


1265 


Y70473 


Homo sapiens 


Human cyclic nucleotide- 

l) . 


2785 


99 


1266 


Y41738 


Homo 
sapiens 


Human PRO541 protein
sequence.


1622 


100


1267 


AF061346 


Mus musculus


Edpl protein 


1077 


64 


1268 


U97006 


Caenorhabditis
elegans






23 


1269 


AF233582 


Mus musculus


GTPase Rab37


942 


95 


1270 


AF195951 


Homo sapiens 


signal recognition particle 
68 


3127


98 


1271 


AL031177 


Homo sapiens 


dJ889M15.3 (novel protein) 




55 


1272 


AF201933 


Homo sapiens 


DC11 




100 


1273


AF201933


Homo sapiens 


DC11 


346 


98 


1274 


AL021710


Arabidopsis
thaliana 


putative protein 


lAa 


49 


1275 


AC004449 


Homo sapiens 


R33683_3 


556 


100 


1276 


Y86295 


Homo sapiens 


Human secreted protein 
HL2AG87, SEQ ID NO: 210. 


1920 


100 


1277 


Y71111 


Homo sapiens 


Human Hydrolase protein-9
(HYDRL-9).


1576 




1278 


S94421 


Homo sapiens 


T cell receptor eta-exon 


478 


100


1279 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100 


1280 


AF161380 


Homo sapiens 


HSPC262 


772 


100 


1281 


Y48616


Homo sapiens 


Human breast tumour- 
associated protein 71. 


779 


100 


1282 


AC015446 


Arabidopsis
thaliana 


Similar to aig1 protein


406 


35


1283
1284


AK024432 


Homo sapiens


FLJ00022 protein


403


35


1285


^96153 


Homo sapiens 


Human FADD-interacting
protein (PIP).


1825 


81 




AJ001019 


Homo sapiens 


ring finger protein 


1301 


100 


1286
1287


AE003823


Drosophila
melanogaster 


CG13178 gene product 


195 


29 




AF178632


Homo sapiens 


FEM-1-like death receptor
binding protein 


3261 


100 


1288 


AC006033


Homo 
sapiens 


similar to MLN 64; similar to
I38027 (PID:g2135214)


1195 


100 


1289 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to
I38027 (PID:g2135214)


668 


93 


1290 


AB023811 


Homo sapiens 


TU3A 


351 


54 


1291 


Z73424 


Caenorhabditis
elegans


C44B9.1 


235 


36 


1292 


Y94871


Homo 
sapiens 




1222 


100 


1293 


AF190425 


Homo sapiens 


protein RAP140


*o9 


29 


1294 
1295 


G03856
AF133670 


Homo sapiens 
Mus musculus


Human secreted protein, SEQ 
ID NO: 7937. 


538 
367 


99
51 


1296 
1297 


AJ249735 
X57560 


Homo sapiens 

Escherichia 

coli 


claudin-6 
pspE protein 


1142 
535 


100 
100 


1298 


AP1^9284 


Homo sapiens 


LIM and cysteine-rich domains

^JJ- LfLCJ.ll J. 


1997 


100 


1299 


U41023 


Caenorhabditis
elegans


coded for by C. elegans cDNA 
yk61fl.3; coded for by C. 
ykl09h8.5 


324 


29 


1300 


AB024523 


Homo sapiens 


basic kruppel like factor 


1206 


100 


1301 
1302 


X55989 
AF007151 


Homo sapiens


eosinophil cationic-related

protein 

unknown 


737 


99 


1303 


X52904


Escherichia
coli 


open reading frame (AA 1-65) 


1481 
359


100 
100 


1304 
1305 


U19577 
AF266508 


coli 

Mus musculus


NELF protein


242 


93 


1306 
1307 


Y57901 


Homo sapiens


Human transmembrane protein 
HTMPN-25.


1409 
932 


97 
100 




U58750 


is elegans 


carrier family 


365 


54 


1308 
1309 


AF044774 
AL078593


Homo sapiens 
Homo sapiens 


breakpoint cluster region 
protein 2 

dJ210B1.1 (KIAA0680)


2681 
267 


99 
34 


1310
1311 


X82693 
Z82263 


Caenorhabdit 


0 antigen 
C47A4.1 


620 
283 


96 
35 


1312 


AF131218 


Homo sapiens 


chromosome 16 open reading 
frame 5 


1493 


100 


1313 
1314 


Y41763 
AF196972 


Homo 
sapiens 
Homo sapiens 


Human PRO938 protein
sequence.
JM24 protein 


1636 


100 


1315 


AF053356 


Homo sapiens 


insulin receptor substrate 
like protein 


2239 
228 


100 
97 


1316


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100 


1317 


AF153127


Gallus 
gallus 


SAPK interacting protein 


2442 


89 


1318 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1477 


83 


1319 
1320 


AF153127
X56932


Gallus 
gallus 
Homo sapiens 


SAPK interacting protein 
23 kD highly basic protein 


1651 


86 


1321 
1322


AF174605 


sapiens]

>Y83086
Y83086 09-
MAR-2000 28-
AUG-1998 F-
box protein
FBP-18.
[Homo
sapiens


F-box protein Fbx25 


1044 
467 


100 
70 




M61732 


Trypanosoma
cruzi


neuraminidase 


214 


24 


1323 


Y17013


porcine 
endogenous 


pol 


304 


64 






retrovirus 








1324 


AL138655 


Arabidopsis 
thaliana 


putative protein 


1174 


37 


1325 


AL138655 


Arabidopsis
thaliana 


putative protein 


946


35


1326


AL133215 


Homo sapiens 


bA108L7.2 (novel protein 
similar to rat tricarboxylate 
carrier) 


1322 


99


1327 


AF161541 


Homo sapiens 


HSPC056 


1357 


99 


1328


Y73346 


Homo sapiens 


HTRM clone 619699 protein 
sequence.


785 


96 


1329 


L10910 


Homo sapiens 


splicing factor 


912 


82 


1330 


AF146568 


Homo sapiens 


MIL1 protein 


1936 


100 


1331 
1332


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide. 


232 


39 


Y41741 


Homo 
sapiens 


Human PRO704 protein 
sequence. 


1860 


100 


1333 


AF295096 


Homo sapiens 


zinc-finger protein ZBRK1


411 


91 


1334 


Z82271


Caenorhabditis
elegans


Similarity to Mouse kinesin-
like protein KIF4 comes from
this gene 


578 


44 


1335 


AE000810 


Methanobacterium
thermoautotrophicum


conserved protein 


290 


43 


1336


Y68779 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-11. 


1019 


91 


1337 


AB027003 


Mus mus cuius 


protein phosphatase 


378 


84 


1338


U44856 


Caenorhabditis
elegans


weak similarity to TPR 
domains 


215 


40 


1339


AE001394


Plasmodium 
falciparum 


protein of the YMR7 family


170 


29 


1340 


X76717 


Homo sapiens 


MT-il protein 


204 


69 


1341 


AC011914 


Arabidopsis 
thaliana 


putative mutT protein; 68398- 
67881


289 


45 


1342 


AJ276171 


Homo sapiens 


ASPld 


2122 


100 


1343 


AF187016 


Homo sapiens 


myosin regulatory light chain 
interacting protein MIR 


2303 


99 


1344 


AC006963


Homo sapiens 


similar to Kelch proteins;
similar to BAA77027
(PID:g4650844)


894 


35 


1345


AF2S74^6- 


Homo sapiens 


N-acetylneuraminic acid
phosphate synthase 


1880 


99 


1346 


Y2S896 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
64. 


1148 


100 


1347 


AJ272073 


Torpedo 
marmorata


male sterility protein 2-like
protein 


1664


58 


1348 


AF161548


Homo sapiens 


HSPC063 


1018 


96 


1349 


W78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96. 


1117 


100 


1351 


G02144 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6225. 


418 


100 




SJj U DO J 


Escherichia 
coli 


similar to 


2047 


100 


1353


A12029 


Homo sapiens 


MRP-14


613 


100 


1354 


AC005328 


Homo sapiens 


R26660_1, partial CDS


870 


74 


1355 


AC024876


Caenorhabditis
elegans


contains similarity to 
SW:RPB1_CRIGR 


829 


61 


1356 


AF077226 


Homo sapiens 


copine III 


1876 


64 


1359 


AF217188 


Mus musculus 


YIP1B 


801 


63 


1360 


AC074331 


Homo sapiens 


ZNF234 


3669 


100 


1361 


AL163279 


Homo sapiens 


homolog to cAMP response 


5035 


99 








element binding and beta 
transducin family proteins 






1362 


Z48475 


Homo sapiens 


glucokinase regulator


3160 


99 


1363


Z48475 


Homo sapiens 


glucokinase regulator 


2682 


" 9? 


1364 


AP195764 


Homo sapiens 


megakaryocyte-enhanced gene
transcript 1 protein; MEGT1 
protein 


2055 


99 


1365 


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1366 


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1367 


AL117352 


Homo sapiens 


dJ876B10.3 (novel protein
similar to C. elegans
T19B10.6 (Tr:Q22557))


2581 


99 


1368 


Y34124 


Homo 
sapiens 


Human potassium channel 
K+Hnovi5. 


1342 


100 


1369 


AJ245621 


Homo sapiens 


CTL2 protein 


3728 


99 


1370 


AF008220 


Bacillus 
subtilis 


YtaG 


429 


45 


1371 


X05562 


Homo sapiens 


alpha-2 chain precursor (AA -25
to 1018) (3416 is 2nd base
in codon)


5908 


99 


1372 


Z98048 


Homo sapiens 


dJ408N23.4 (novel DnaJ domain 
protein) 


1296 


99 


1373


AF154415 


Homo sapiens 


FLASH 


10253 


100 


1374 


U20286 


Rattus 
norvegicus


lamina associated polypeptide
1C 


1567 


69 


1375 


0^3445 


Homo sapiens 


DOC1


1645 


46 


1376 


AL117337 


Homo 
sapiens 


bA393J16.1 (zinc finger
protein 33a (KOX 31))


250 


60 


1377 


AC005328


Homo sapiens 


R26660_1, partial CDS


1126 


100 


1378 


U35113 


Homo sapiens 


metastasis-associated gene


1823 


69 


1379 


1*15313 


Caenorhabditis
elegans


putative 


858 


58 


1380 


Y25756 


Homo sapiens 


Human secreted protein 
encoded from gene 46. 


1508 


100 


1381 


AB037360 


Homo sapiens 


ANKHZN 


5734 


95 


1382


AB037360 


Homo sapiens 


ANKHZN 


959 


97 


1383 


AF237676 


Mus mus cuius 


G beta- like protein GBL 


1721 


96 


1384 


AF237676 


Mus mu3culus 


G beta- like protein GBL 


1043 


70 


1385 


Y58793 


Homo sapiens 


Human calcium regulatory 
protein CaRKG-1.


7^ 


100 


1386 


AF212162 


Homo sapiens 


ninein


10369 


99 


1387 


AL031685 


Homo sapiens 


dJ963K23.2 (novel protein) 


337 


33 


1388 


AC004890


Homo sapiens 


similar to zinc finger 
proteins; similar to BAA24380
>W06316 W06316 03-OCT-1996
27-APR-1995 TRP-1 protein.


542 


86 


1389 


API 87 9 39 


Homo sapiens 


zinc finger protein ZNF223 


2665 


99 


1390


AC03S150 j 


Homo sapiens 


Zinc finger protein ZNF221 


3459 


100 


1391 


AF287894 


Homo sapiens 


PIST 


1410 


97 


1392 


AF282265 


Homo sapiens 


inner centromere protein 
INCENP 


1794 


99 


1393 


X90840 


Homo sapiens 


axonal transporter of 
synaptic vesicles 


4584 


99 


1394 


AF076249 


Homo sapiens 


zinc finger protein SBBIZ1 


3208 


99 


1395 


G02224 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6305. 


299 


75 


1396 




Arabidopsis
thaliana 


similar to 


130 


34 


1398 


AF242519 


Homo sapiens 


zinc finger protein SBZF3 


181 


65 


1399 


AL133396 


Homo 
sapiens 


dJ1068H6.4 (prion protein
like protein doppel) 


962 


100 


1400 


V48611 


Homo sapiens 


Human breast tumour- 
associated protein 72. 


817 


99 


1401 


AC004472 


Homo sapiens 


PI. 11659 5 


280 


54 


1402 


X91489 


Saccharomyces
cerevisiae


putative HMG box 


i£4 


21 


1403 


Y79222 


" Homo 


If timers #»^?» n «-» ^ moMp dp i a 


2842 


100 


1404 


X810S8 


Mus musculus 


tex261 


1010 


99 


1405 


AB012084 


Mus musculus




iin 


194 


29 


1406


AB0302S1 


Homo sapiens 


GTPase activating protein 


3233 


99 


1407 


AJ010585 


rat bus 


rio-iuce procein 


2684 


99 


1408 




melanogaster 




364


29 


1409 


TJ7fTfi1 fl 


Mus musculus 


N-RAP 


804 


48 


1410 




Homo sapiens 


J?20B87_1, partxal CDS 


835 


63 


1411 


AE000284 


Escherichia
coli


orf, hypothetical protein. 


360 


100 


1412 




Escherichia 
coli 


L5 (rplE) (aa 1-179) 


911 


100 


1413


W78279


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 




AB031051


Homo sapiens 


organic anion transporter 
OATP-E 


3832 


100 


1415 


M17466 


Homo sapiens 


coagulation factor XII 


3455 


100 


1416


AF097994 


Homo 
sapiens 


L-kynurenine/alpha-
aminoadipate aminotransferase 


2202 


99 


1417 


AF151077 


Homo sapiens 


HSPC243 


1262 


99 


1418 


Y09945 


Rattus 
norvegicus


putative integral membrane 
transport protein 


1098 


61 


1419 


U13152 


Mesocricetus 
auratus 


guanine nucleotide-binding 
protein beta 5 


2179 


76 


1420 


AL162458 


Homo sapiens 


DA465L10.5 (KIAA1176 (novel
protein, presumed ortholog
of mouse K-Cl cotransporter
KCC2))


5696 


100 


1421 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ7BS) amino
acid sequence SEQ ID NO: 308.


152 


29 


1422 



Y94923 


Homo sapiens 


Human secreted protein clone 
qsl4_3 protein sequence SEQ 
ID NO: 52.


4039 


99 


1423 


AF177388 


Homo 
sapiens 


cancer-amplified
transcriptional coactivator 
ASC-2 


10748 


99 


1424 


Y48517 


Homo sapiens 


Human breast tumour- 
associated protein 62. 


1851


99 


1425 


AF208848 


Homo sapiens 


BM-006 


1454 


89 


1426 


AF208848 


Homo sapiens 


BM-006


853 


79 


1427 


AF112886


Bos taurus


differentiation enhancing 
factor 1 


4693 


95 


1428 


U41387 


Homo sapiens 


Gu protein 


1372 


63 


1429 


AF161534


Homo sapiens


HSPC049


2853 


78 


1430 


AP125043 


Mus musculus 


bisphosphate 3'-nucleotidase


275 


30 


1431 


Y66718 


Homo 
sapiens 


Membrane-bound protein
PRO1105. 


1866 


100 


1432 


AF193613 


Homo sapiens 


cell recognition molecule 
Caspr2 


568 


100 


1433 


AB044.^0 


Mus musculus 


Gliacolin


1S>2 


34 


1434 


R92900 


Homo sapiens 


NTll-l nerve protein, 
facilitates regeneration of 
nerve cells . 


707 


51 


1435 


AF220530 


Homo sapiens 


myo-inositol 1-phosphate
synthase Al 


2904 


100 


1436 


X70944 


Homo sapiens 


PTB-associated splicing
factor 


1261 


72 


1437 


AF271732 


Homo sapiens 


bridging integrator-3


1282 


100 


1438 


Y30811 


Homo sapiens 


Human secreted protein 
encoded from gene 1.


59S 


98


1439 


AJ293659 


Homo sapiens 


mucolipidin


628 


97 


1440 


AF219138 


Homo sapiens 


GGA3 long isoform


3083 


100 


1441 


AF219138


Homo sapiens 


GGA3 long isoform 


3346 


100 
PCT/US00/34263


1442 




Homo sapiens


-— — — — 


1944 


100 


1443


At*i37711 


Drosophila

melanogaster 


VloDlQ 


191 


27 


1444 


AJ011Q9S 


Homo sapiens


riaxx Decs procein 


439 


39 


1445 




Homo sapiens


phosphorylase kinase 


6233 


98 


1446 


AF2141- 4 


Homo sapiens


breast carcinoma-associated


3999 


99 


1447 


AF003924 


Homo sapiens


ANC 2H01 


2645 


99 


1448


AF003135


Caenorhabditis
elegans


contains weak similarity to
an AMP-binding motif


2843 


52 


1449 


AF155112 


Homo sapiens 


NY-REN-50 antigen


1184 


89 


1450 


Y95004 


Homo sapiens 


Human secreted protein 
vc54_1, SEQ ID NO: 48.


985 


100 




AMU / xU J 


Homo sapiens 


ataxin 2-binding protein 


688 


57 


1452 


AF107203 


Homo sapiens 


ataxin 2-binding protein 


456 


78 


1453 


Z38011 


Mus musculus 


DMR-N9 


882 


56 


1454 


X90568 


Homo sapiens 


Protein sequence and 
annotation available soon via 
LABEIT@EMBL-Heidelberg.DE


510 


28


1455 


AL035409 


Homo sapiens 


dJ564Mll.3 (similar to 
sialyltransferase)


1356 


100 


1456 


D44480 


Mus musculus 


MATH-2 protein


272 


100 


1458 


AF141326 


Homo sapiens 


RNA helicase HDB/DICE1


478 


45 


1459


AF242552 


Gallus 
gallus 


retinovin 


945 


34 


1460 


U11036 


Homo sapiens 


Ibdl 


724 


84 


1461 


AB025258 


Mus musculus 


granuphilin-a 


545 


39 


1462 


Y08134 


Homo sapiens 


acid sphingomyelinase-like
phosphodiesterase


2428 


99 


1463 


AC004997 


Homo sapiens 


match to ESTs 243979 
(NID:g573097), R19699
(NID:g774333) 


869 


98 


1464 


AC004997


Homo sapiens 


match to ESTs 243979 
(NID:g573097), R19699
(NID:g774333) 


869 


98 


1465 


U32743 


Haemophilus 

influenzae 

Rd 


fucose operon protein (fucU)


315 


50 


1466 


Y09022 


Homo sapiens 


Not56-like protein 


2342 


100 


1467 


AC003034 


Homo sapiens 


Homolog of rat kidney- 
specific (KS) gene


1072 


99 


1468 


AP071544 


Spinacia 
oleracea 


ribulose-1,5-bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


333 


26 


1469


XO tit J O 


Homo sapiens 


Human transmembrane protein 
HTMPN-54.


10^3 


100 


1470 


AF032S6G 


Rattus 
norvegicus 


rsecS 


4504 


93 


1471 


Y70467 


Homo sapiens 


Human membrane channel 
protein-17 (MECHP-17).


452 


74 


1472 


AL031033 


Homo sapiens 


C321D2.1 (Ribosomal Large
Subunit Pseudouridine 
Synthase protein) 


1694 


100 


1473




Homo sapiens 


genethonin 3 


4026 


98 


1474 


t3H jj jo 


Homo sapiens 


HTSl 


1101 


50 


1475 


looz'ii 


Homo sapiens 


Human secreted protein 
HOABR60, SEQ ID NO: 156. 


1879 


98 


1476 


AJ010317 


Fugu
rubripes 


Sand 


1278 


68 


1477 


U42831 


Caenorhabditis
elegans


coded for by C. elegans cDNA
yk99b4.3; similar to human
transforming protein
(PIR:S22157)


846 


44 


1478 


X62447 


Homo sapiens 


PR 264 


543 


61 


1479


X82209


Homo sapiens 


MN1 


7116 


100 


1480 


U10536 


Pan paniscus


MHC class I A


675 


84 


1481 


AL078599


Homo sapiens 


dJ9SlC6.1 (novel protein 
oAiii-Liat co . eiegans 

V^^fltO Q (Tv .dqi nor \ l 
rosHii.y I lx : P91086 / ) 


1274 


65 


1482 


Z98977 


Schizosaccha
romyces
pombe


jwm.av.ivc vacuoiat procej.n 


256 


29 


1483 


AB005662 


Mus musculus


JNK/SAPK-associated protein-1


4968 


92 


1484 


Ai050126 


Homo sapiens 


"yj't.uiitsL.iccijL protein 


716 


100 


1485 


M27878 


Homo sapiens 


DNA binding protein 


1006 


53 


1486 


Y69161 


Homo sapiens


Amino acid sequence of a
partial protein kinase. 


575 


99 


1487 


X84156


Saccharomyce


ATH1


341 


29 






s cerevisiae 






1488 




Homo sapiens 


RNA helicase


446 


34 


1489 


U56966 


Caenorhabdit
is elegans


coded for by C. elegans cDNA
yk30b3.5; coded for by C.
elegans cDNA yk30b3.3


620 


42 


1490 


AE000989 


Archaeoglobu 
s fulgidus 


enoyl-CoA hydratase (fad-4)


533 


46 


1491 


M80633 


Rattus
norvegicus


adenylyl cyclase type IV


707 


95 


1492 


Y73342 


Homo sapiens 


HTRM clone 2709055 protein 
sequence.


3513


99 


1493 


Y17220


Homo sapiens 


Human secreted protein (clone 
fj283-11).


462


37


1494 




Mus musculus 


ARL-6 interacting protein-2 


701 


97 


1495 


Y94897 


Homo 
sapiens 


Human protein clone HP10574. 


1371 


100 


1496 


AL049699 


Homo sapiens 


dJ747H23.2 (novel protein) 


1550 


100


1497 


AF037447 


Homo sapiens 


ribosomal S6 protein kinase 


2427 


100 


1498 


AL445067 


Thermoplasma 
acidophilus 


putative target YPL207w of 
the HAP2 transcriptional 
complex related protein 


269 


35 


1499 


AB039947 


Homo sapiens 


X11L-binding protein 51


227 


36 


1500 


AJ277750 


Homo sapiens 


UBASH3A protein 


3509 


100 


1501 




Homo 
sapiens 


dJ93K22.1 (novel protein 
(contains DKFZP564B116) ) 


2439 


100 


1502 
1503 


AF17989^ 
AF178948 


Homo sapiens 


TALE homeobox protein Meis2b
TALE homeobox protein Meis2a


1140 


100 


1504 


Y53Q05 




Human secreted protein clone 
pn749_6 protein sequence SEQ 

tj-\ xrn . i c 
±u rsu : it> . 


1177 
1442 


100 
99 


1505 


X82494


Homo sapiens 


fibulin-2 


3580 


99 


1506 


X98296 




ubiquitin hydrolase


783 


42


1507


AL034548 


Homo sapiens 


dJ1103G7.S (novel protein) 


1099 


100 


1508 


Y76144 


Homo sapiens 


Human secreted protein
encoded by gene 21.


1736 


100 


1509 


AF220182 


Homo sapiens 


uncharacterized hypothalamus 
protein HT008 


1181 


98 


1510 


U64601


Caenorhabdit
is elegans 


aria nv^KaKI « m Uajv JL _ ^ \_ 

gene probably begins in the
next cosmid


415


58 


1511 


AL356192 


Neurospora
crassa


icaolcu to vMtvxi, protein 


196 


29 


1512 
1513 


D17629 
AF168717 


Homo 
sapiens 
Homo sapiens 


N-acetylgalactosamine 6-
sulfate sulfatase (GALNS)
x uu? protein 


1829 
694 


100 
99 


1514 
1515


AJ243531 
AC003672 


Arabidopsis
thaliana 


putative C3HC4-type RING zinc 
finger protein 


rjT 

735 

407 


100 
30 


1516 


A*llSi435 


Rattus 
norvegicus 


syntaxin 17 


1374


90 


1517 


AF003140 


Caenorhabdit
is elegans


C44E4.5 gene product


274 


31 


1518 


AB002584


Rattus
norvegicus


beta-alanine-pyruvate
aminotransferase


2238


82 


1519 


AL121764 


Schizosaccha


yeast atp12 protein precursor


270 


30
TABLE 2



PCT/US00/34263 



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


% 

IDENTITY 






romyces 
pombe 


homolog






1520


AF^S'SSlO 


Homo 
sapiens 


vascular endothelial 
junction-associated molecule 


547 


100 


1521 


D31764


Homo sapiens 


KIAA0064 


170 


27 


1522 


Y66634 


Homo 
sapiens 


Membrane-bound protein
PRO190. 


985 


100 


1523 


Y94450 


Homo sapiens 


Human inflammation associated 
protein 


250 


43


1524 


AC000107


Arabidopsis 
thaliana 


F17F8.22 


277 


^i 


1525 


AF109377 


Mus musculus 


ldlBp


1277


83 


1526 


AL031427 


Homo sapiens 


dJ167A19.4 (novel protein) 


1432 


99 


1527 


Y08135 


Mus musculus 


acid sphingomyelinase-like

phosphodiesterase


1496 


79 


1528 


AK024423 


Homo sapiens 


FLJ00012 protein 


b jL-L 


100 


1529 


AP154502 


Homo sapiens 


quiescent cell proline 
dipeptidase 


679 


100 


1530 


AF205598 


Homo sapiens 




1366 


100 


1531 


AF251039 


Homo sapiens 


putative zinc finger protein


1420 


50 


1532 


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene / / clone
HOBAS24.


493


57




1533 


AF039023 


Homo sapiens 


nan-uiir oinaingf procexn/ 
RanBP6 


5707 


99 


1534 
1535 


AC007190 
AB027564 


Arabidopsis 
thaliana 
Homo sapiens 


F23N19.9 
DINB1 


374
4482 


3? 
100 


1536 
1537 


Y36178 
Y50907 


Homo sapiens 
Homo sapiens 


Human secreted protein 
Human fetal brain cDNA clone
vb3_1 derived protein.


377
3593 


87 
99 


1538 


AF017368 


Mus musculus 


faciogenital dysplasia 
protein 2 


177 


47 


1539 
1540 


AF266756 
Z48804


Homo sapiens
Homo sapiens


sphingosine kinase
OA1


2011 


99 


1541 


AF000195 


Caenorhabdit 
is elegans 


Contains similarity to Pfam
domain: PF00169 (PH),

N=1


2238 
379 


100 
42 


1542 

1543
1544 


Y71159 

X76092 
AB015330 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human phosphodiesterase 
interacting protein, 
myomegalin. 

DNA binding protein RFX3
HRIHFB2007


9415 
3327 


99 
100 


1545 
1546 


AF198487 
AF016417 


Homo sapiens 
Caenorhabdit 
is elegans 


transcription factor LBP-lb 
Similar to BZIP transcription 
factor 


631 

2822

518 


!>l/ 

100 

42 


1547 


X55885 


Homo sapiens 


KDEL receptor 


1106 


100 


1548 
1549 


AB03549S 
AL021707 


Carassius 

auratus 

Homo sapiens


ubiquitin-activating enzyme
E1
dJ508I15.4 (KIAA0668) 




42 


1550 


AJ223978 


Bacillus 
subtilis 


YvqK protein


3688 
292 


100 


1551 


AF145615 


Drosophila
melanogaster


BcDNA.GH03377


822 




1552 


AL157734 


Schizosaccha

romyces 

pombe 


putative mannosyl transferase 
involved in N-glycosylation 


435 


37 


1553 


AF079527 


Mus musculus 


IER5


691 


63 


1554


AB026291


Rattus
norvegicus


acetoacetyl-CoA synthetase 


1099 


88


1555


Y44722 


Homo sapiens 


Human immune system molecule, 
ISMO-3.


1780 


99 


1556 
1557 


AF116553 
Y71056 


Drosophila 
melanogaster 
Homo sapiens 


antennal-specific short-chain
dehydrogenase/reductase


277 


32 








Human membrane transport 


1975 


99
TABLE 2
SEQ
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 








protein, MTRP-1. 






1558


X / JLvOO 


Homo sapiens 


Human membrane transport 
protein, MTRP-1.


1975 


99 


1559 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1.


1894 


97 


1560 


AF092050 


Mus musculus 


beta-1,3-N-

acetylglucosaminyltransferase


262 


44 


1561


^ T.I ftQQO? 


Homo sapiens 


dJ309K20.2 (acrosomal protein
ACR55 (similar to rat sperm
antigen 4 (SPAG4)))


1607 


97 


1562 


AJ131890 


Homo sapiens 


DNA polymerase lambda


3002 


100 


1563 


AL035424 


Homo sapiens 


dA22D12.1 (novel protein 
similar to Drosophila Kelch 
proteins) 


3015


100 


1564 


AC002400 


Homo sapiens 


Gene product with similarity 
to Ubiquitin binding enzyme 


2790 


100 


1565 


AC005306 


Homo sapiens 


R27216_l, 


919 


82 


1566 


AF000195 


Caenorhabdit 
is elegans 


Contains similarity to Pfam
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


550 


45 


1567 


AB033281 


Homo 
sapiens 


F-box and WD-repeats protein
beta-TRCP2 isoform C


2879 


100 


1568


D49473


Mus musculus


truncated form of Sox17


1047 


78 


1569


AK025270 


Homo sapiens 


unnamed protein product 


210 


91 


1570 


X75756 


Homo sapiens 


protein kinase C mu


4797 


99 


1571 


AF145713 


Homo sapiens 


SCHIP-1 


2388 


100 


1572 


AE003831 


Drosophila 
melanogaster


CG18445 gene product


180 


31 


1573


AF074603 


Streptomyces 
griseus 
subsp. 
griseus 


NonF 


205 


38 


1574 


U28993 


Caenorhabdit 
is elegans 


F22D3.3 gene product 


144 


27 


1575 


AP129507 


Homo sapiens 


transcription factor ICBP90 


287 


68 


1576


X64878


Homo sapiens 


oxytocin receptor 


2002 


100 


1577 


AF237711 


Drosophila 
melanogaster 


Diablo 


421 


54 


1578 


G00975 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5056. 


480 


100 


1579 


AF248744 


Cryptosporid 
ium parvum 


thrombospondin-related
adhesive protein 


123 


33 


1580


AL121782 


Homo sapiens 


dJ585I14.2 (novel protein 
(translation of cDNA 

a\v\ . e\t\\J U \)& Lj f ) 


663 


100 


1581 


AF041853


Homo sapiens


kinesin family member protein 
KIF3A 


345 


33 


1582 


AF025441




Opa-interacting protein OIP5


1198 


100 


1583 


AE001803 


Thermotoga
maritima


glycerate kinase, putative 


349 


34 


1584 


AF252283 


Homo sapiens 


Kelch-like 1 protein


3973 


100 


1585 


AF169675 


can S (*T\ Q 


leucine-rich repeat
(.^dnbumnmrduc procein cuici± 


3494 


99 


1586 


AF11B274 


nuiUhr cue* 




2628 


97 


1587 


X79440 


Homo sapiens 


NADP+-dependent malic enzyme


3167 


99 


1588 


X99802 


Homo sapiens


ZYG homologue


3966


99 


1589 


AF169803 


Homo sapiens 


flavohemoprotein b5+b5R 


2563 


100 


1590 


Y29861 


Homo sapiens 


Human secreted protein clone 
cb98_4. 


181 


47 


1591 


Z25535 


Homo sapiens 


nuclear pore complex protein 
hnup153


7567 


99 


1592 


X13293 


Homo sapiens 


B-myb protein (AA 1-700) 


3678 


99 


1593 


M74027 


Homo sapiens 


mucin 


242 


27 


1594


AL139314 


Schizosaccha 
romyces 


hypothetical protein 


235 


54
TABLE 2



SEQ 
ID 
NO : 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH - 
WATERMAN 
SCORE 


%

IDENTITY 






pombe 








1595


V* /OJ<I'i 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 81. 


1318 


98 


1596




Homo sapiens 


Human secreted protein clone 
iDo*3_j p race in sequence day 
TD TOO . 1 ft 


2236 


98 


1597 


AF17460S 




r — »ws protein tDXZb 


1408 


99 


1598 


AB032254 


Homo 
sapiens 


bromodomain adjacent to zinc 
linger auiain cOK 


9676


98 


1599 


X73114 


Homo sapiens


slow MyBP-C 


5568 


95 


1600 


X82200 


Homo sapiens 


gpStaESO 


2305 


100 




X UUO / Q 


Homo
sapiens 


Human laph-1 protein 
sequence . 


1149 


98


1602 




Homo sapiens 


HIRA- interacting protein 3 


2821 


99 


1603 


AuZZzoO- 


Homo sapiens 


neutral sphingomyelinase 


2268 


99 


1604 




Homo sapiens 


neutral sphingomyelinase 


1601 


99 


1605


AF185576


Mus musculus


POZ/zinc finger transcription 
factor ODA-8 


3435 


97 


1606 


AF093744 


Homo sapiens 


unknown 


131 


106 


1607 


A12142 


synthetic 
construct 


IFN-pseudo-omega 2


800 


98 


1608 


Y57949 


Homo sapiens 


Human transmembrane protein 
HTMPN-73 . 


1668 


100 


1609 


AF151044 


Homo sapiens 


HSPC210


681 


37 


1610 


X15218 


Homo sapiens 


Ski protein (AA 1-728)


3765 


100 


1611 


Y08200 


Homo sapiens 


rab geranylgeranyl 
transferase 


2976 


100 


1612 


AF220560 


Homo sapiens 


B/K protein 


2486 


99 


1613 


AC004481 


Arabidopsis 
thaliana 


nodulin-like protein 


371 


26 


1614 


Y09501 


Homo sapiens 


NADH-cytochrome-b5 reductase 


1607 


100 


1615 


Y15521 


Homo sapiens 


start position 1 


3150 


97 


1616 


AJ010750


Rattus 
norvegicus 


Castration induced prostatic 
apoptosis related protein-1,
(CIPAR-1)


890 


62 


1617 


X58079


Homo sapiens 


S100 alpha protein 


481 


100 


1618


Y66^678 


Homo 
sapiens 


Membrane-bound protein
PRO1009.


9C7 


100 


1619 


AJ242973 


Homo sapiens 


peptide methionine sulfoxide 
reductase 


929 


100 


1620 


AF150733 


Homo sapiens 


AD-014 protein


288 


100 


1621 


AJ007509 


Homo sapiens 


E1B-55kDa-associated protein


4646 


98 


1622 


X64177 


Homo sapiens 


metallothionein 


380 


100 


1623 


AE00104S 


Archaeoglobu
s fulgidus 


A. fulgidus predicted coding 
region AF0859 


240 


36 


1624 


n ¥ 1 CCII1 "5 


Schizosaccha

romyces 

pombe 


mitochondrial carrier protein 


403 


34 


1625 


Y66746 


Homo 
sapiens 


Membrane-bound protein

PROH98 . 


1184 


100 


1626 




Sus scrofa 


destrin 


863 


100


1627




Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
203. 


756 


100 


1628 


AL031775 


Homo sapiens 


dJ30M3.2 (novel protein)


470 


100 


1629 


AF132484 


Mus musculus


unknown 


236 


6-8 


1630 


AF017096 


Drosophila
melanogaster


similar to C. elegans
R10H10.6 and S. cerevisiae 
YD8419.03C 


493 


61 


1631 


X03077 


Homo sapiens 


lactate dehydrogenase -A 


1704 


100 


1632 


AF151084 


Homo sapiens 


HSPC2S0 


76^3 


100 


1633 


AJ001874 


Homo sapiens 


or! 


255 


97 


1634 


AC012187 


Arabidopsis 
thaliana 


Contains weak similarity to 
GATA-6 DNA-binding protein
gb|H36135, gb|226200 come 
from this gene. 


143 


38
TABLE 2



SEQ
ID

NO:


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


1635 


AFU26246 


Homo sapiens 


HERV-E integrase


411 


90 


1636


Y50943


Homo sapiens


Human adult brain cDNA clone
ve8_1 derived protein.


1126 


95 


1637


Ac X J 


Homo sapiens 


L-pipecolic acid oxidase 


2068 


99 


1638 


AJ238247 


Mus musculus 


putative phosphatase subunit 


1948 


96


1639


Y94942 


Homo sapiens 


Human secreted protein clone 
yk25l_l protein sequence SEQ 
ID NO: 90.


1320 


100 


1640 


AP235030 


Homo sapiens 


BM88 antigen 


766 


99 


1641 


AF233288 


Drosophila 
melanogaster


WDS 


358 


26 


1642 


M19351


Mus musculus 


immunoglobulin heavy chain 
binding protein 


145 


34 


1643 


Y70452 


Homo sapiens 


Human membrane channel 
protein-2 (MECHP-2).


1352 


203 


1644 


AF176520 


Mus musculus 


WD repeat- containing F-box 
protein FBW5 


2676 


88 


1645


W67816 


Homo sapiens 


Human secreted protein 
encoded by gene 10 clone 
HCEMU42. 


1156 


100 


1646 


X67155 


Homo sapiens 


mitotic kinase-like protein-1


4456 


99 


1647 


M63180 


Homo sapiens 


threonyl-tRNA synthetase 


1040 


61 


1648 


Y87342 


Homo sapiens 


Human signal peptide 
containing protein HSPP-119 
SEQ ID NO: 119. 


1566 


93 


1649 


R95332 


Homo sapiens 


Tumor necrosis factor 
receptor 1 death domain 
ligand (clone 3TW).


4137 


166 


1650 


AC007136


Homo sapiens 


Putative map kinase 
interacting kinase 


856 


99 


1651 


AB01S346 


Homo sapiens 


Eps15R


4464 


39 


1652 


AL161576 


Arabidopsis 
thaliana 


putative protein 


1341 


48 


1653 


AC005313 


Arabidopsis 
thaliana 


putative calmodulin 


288 


28 


1654 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein)


3526 


100 


1655 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein)


3526 


100 


1656 


AB017910 


Dictyosteliu
m discoideum


myoM 


297 


32 


1657 


Y28919 


Homo 
sapiens 


Human regulatory protein 
HRGP-S. 


2251 


99 


1658 


AF056191 


Homo sapiens 


TPA inducible protein 


2744 


98 


1659 


U76846 


Arabidopsis 
thaliana 


ubiquitin-specific protease


137 


35 


1660


AL078627 


Schizosaccha 

romyces 

pombe 


actin-like protein; (2 actin 
domains) 


320 


34 


-Lb b £ 


X52022 


Homo sapiens 


collagen type VI, alpha 3 
chain 


16274 


99 


1663 


AF300648 


Homo 
sapiens 


guanine nucleotide binding 
protein beta subunit 4 


1811 


100 


1664 


AF214736 


Homo sapiens 


EH domain containing protein 
2 


2774 


100 


1665 


Z48613 


Saccharomyce
s cerevisiae 


unknown 


138 


26 


1666 


AF177385 


Homo 
sapiens 


cytochrome c oxidase assembly 
protein isoform 2 


1395


99 


1667 


AC007842 


Homo sapiens 


BC331191_1 


1581 


47 


1668 


S67513 


Borna
disease 
virus BDV, 
WT-1, Halle 
Bl/91, horse 
brain, field 
isolate. 
Peptide, 370 


p40 


397


43 



TABLE 2



SEQ 
ID 
NO: 


ACCESSION
NUMBER


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY






aa 









1669 


Z99753 


Schizosaccha 

romyces 

pombe 


" PUtative tdoLl-IvQ'P^-gim family 

nucleolar protein 


569 


47 


1670


G03130


Homo sapiens


Human secreted protein, SEQ
ID NO: 7211.


/ 


o-i"" 

97 


1671 


M9^625 


Gallus 
gallus


cardiac muscle tensin 


11 DC 
llOS 


54 


1672


AF174482


Homo sapiens 




2005 


99 


1673 


Y51946


Homo sapiens 


Human 18.1 homolog protein 
fragment


233 


29 


1674 


AF255334 


Homo sapiens 


KXP35 


152


29


1675 


Y94347


Homo
sapiens 


Human nrn^Pin r\r\ udi r\cci 


109 


30 


1676 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2.


3043 


99 


1677 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2.


1580 


91 


1678


AF163151


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 


17 


1679 


AF163151 


Homo sapiens


dentin sialophosphoprotein 
precursor 


170 


17 


1680 


AK0244S3 


Homo sapiens 


FLJ00045 protein


1349


100 


1681 


AF019236 


Dictyosteliu
m discoideum


TipD 


613 


34 


1682


AJ243459


Leishmania
major


proteophosphoglycan 


153 


26 


1683 


Z69369 


Schizosaccha
romyces
pombe


putative GTP-binding protein


560 


46 


1684 


X94 9L0 


Homo sapiens




1334 


100 


1685 


AF286475 


Fugu
rubripes


retinitis pigmentosa GTPase
regulator-like protein


196 


19 


1686 


AF191298 


Homo sapiens 


vacuolar sorting protein 35 


4087


100 


1687 


Au27598^ 


Homo sapiens


transcription factor 


2958 


100 


1688 


AJ275986 


** ^ Hi W LJ ^-V— llC) 


transcription factor 


1886 


88


1689 


X07311 


Drosophila 
melanogaster 


heat shock protein 


138 


43 


1690 


AF240463


Rattus 
norvegicus 


LIS1-interacting protein
NUDE1 


1383 


83 


1691 


AJ272078 


Homo sapiens 


APOBEC-1 stimulating protein


1256 


68 


1692 


AJ272079 


Homo sapiens 


APOBEC-1 stimulating protein


1336 


60 


1693 


AF177942 


Xenopus 
laevis 


katanin p60 


1664 


66 


1694 


AF2^3539 


Homo sapiens 


arginine N-methyltransferase


1774


100


1695 
1696


AF222689 


Homo 
sapiens 


protein arginine N-

methyltransferase 1-variant 2


1182 


ai 




AK000193 


Homo sapiens


unnamed protein product 


1060 


100 


1697


AB041035 


Homo sapiens 


kidney superoxide-producing
NADPH oxidase 


3122 


100 


1698 
1699 


AB041035 
AF025772 


Homo sapiens 
Homo sapiens 


kidney superoxide-producing

NADPH oxidase 

C2H2 zinc finger protein 


2181 


100 


1700 
1701 


Y4467* 
AK022407 


Homo sapiens
Homo sapiens


Human ARF-Related Protein-1
(HARP-1).


488

938


54 

"Snr 

97 


1702 
1703 


AB024574 
AP055078 


Homo sapiens 
Homo sapiens 


GTP-binding like protein 2
zinc finger protein 42 


1172 


98 
100 


1704 


AF198092 


Mus musculus


RP42 


421 
10S7 


52 
77 


1705 


AE003573 


Drosophila 
melanogaster 


CG12474 gene product 


161 


33 


1706 


AB036345 


Drosophila 
melanogaster 


aquaporin 


164 


24 


1707 


Y5S927 


Homo sapiens 


Human STLK2 protein. 


2144 


100 


1708 


U27121 


Danio rerio 


G12 


212 


47 


1709 


AI*391710 


Arabidopsis 


putative protein 


50$ 


50
TABLE 2



SEQ
ID

NO:


ACCESSION
NUMBER


SPECIES

thaliana


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


1710 
1711 


B01311 
U40750 


Homo sapiens

Mus musculus


Human fkujh polypeptide,
formin binding protein 30 


1649 
4561 


97 

85


1712 




Mus musculus


skeletal muscle and cardiac 
protein 


1490 


89 


1713 


AF255303 


Homo

sapiens


membrane-associated nucleic 
acid binding protein 


4416 


99 


1714 


AF2S530"* 

*V* £* *J >J ^ V ^ 


Homo


membrane-associated nucleic
acid binding protein


2960 


100 


1715 


tl08227 


Rattus
norvegicus


Ras-related protein


511 


51 


1716 


AF16B795 


Rattus
norvegicus


schlafen-4


1129 


44 


1717 


AF196304 


Homo sapiens


SUMO-1-specific protease


5B04 


99 


1718 


AL355737 


Homo sapiens 


HMG20A 


1782 


100 


1719 


auno qui 


Halocynthia
roretzi 


HrPET-1 


1069 


46 


1720


ArO / Ljx7 


Mus musculus


COP9 complex subunit 7b


1297 


97 


1721




Homo sapiens 


HEYL protein


1681 


9* 


1722 


G01982


Homo sapiens


Human secreted protein, SEQ
ID NO: 6063.


718 


100 


1723 


AL032643 


Caenorhabdit 
is elegans 


similar to Uncharacterized
protein family UPF0034, 


825 


41 


1724 


G01972 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6053. 


586 


92 


1725 


Y94441 


Homo 
sapiens 


Human Adipose Specific 
Protein 1. 


1231 


100 


1726 




Homo sapiens 


CGI-201 protein


4397 


99 


1727 


AF1B3426 


Homo sapiens 


HT004 protein 


1810 


99 


1728


DxQoa 4 


Bos taurus 


neurocalcin


1002 


99 


1729


Z1S529 


Gallus 
gallus


tensin 


1411 


84 


1730


Z73423


Caenorhabdit
is elegans


cDNA EST EMBL:Z14908 comes
from this gene-cDNA EST this 
gene 


"233* 


41


1732 




Homo sapiens


PR00105 


470 


30 


1733 


AJ277724 


Homo sapiens 


histone deacetylase 8


2015


100


1734 


G04050 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8131. 


503 


95 


1735 


D45913 


Mus musculus 


leucine-rich-repeat protein


3531 


94 


1736 
1737 


AFU96709 


Drosophila 
virilis
Homo sapiens 


failed axon connections 
protein 

dynactin p62 subunit 


276


32 


1738 


L15314 


Caenorhabdit 
is elegans 


contains similarity to Pfam 
family PF01772 N=l 


2417 
206 


99 
37 


1739


A54blo 


Listeria 

monocytogene 

s 


phosphatidylinositol specific
phospholipase C 


134 


27 


1740 


AL0316S8 


Homo sapiens 


dJ310O13.4 (novel protein
similar to predicted C.
elegans and C. intestinalis
proteins) 


123 


31 


1741 


Y35924 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
173. 


1013 


99 


1742 


AC013354 


Arabidopsis 
thaliana 


F15H18.15 


202 


32 


1743 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08. 


1932 


59 


1744 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08. 


1854 


61 


1745 


AF221090


Homo 
sapiens 


Ral guanine nucleotide 
exchange factor RalGPSlA 


1224 


70 


1746

1747 


Y9d372 
Y94294 


Homo sapiens 
Homo sapiens 


Human PR01430 (UNQ736) amino 
acid sequence SEQ ID NO: 116. 
Human coenzyme A-utilizing


1332 
842 


99 
100
TABLE 2



SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 
enzyme CoAEN-2. 


SMITH-
WATERMAN
SCORE


%

IDENTITY 


1748 
1749 


AK024436 
AE0OQB77 


Homo sapiens
Methanobacte 
rium 

thermoautotr 
ophicum 


FLJ00026 protein

conserved protein


1619 
231 


100
36


1750 
1751 


AF101361 
Y15067 


Drosophila 
melanogaster 
Homo sapiens


Abnormal X segregation
ZNF232


193 
889 


33 
100 


1752 
1753 

1754 


AF251038 
AC003093 

X69089 


Homo sapiens 
Homo sapiens 

Homo sapiens 


GAP-like protein 

OXYSTEROL-BINDING PROTEIN;

45% similarity to P22059 

(PID:gl29308) 

165kD protein


822
352

5703 


100 
57 

99 


1755 
1756 


AL049795 
AL031393 


Homo sapiens 
Homo sapiens 


dJ622L5.3 (novel protein) 
tw / jjuii.i i^inc-xinger 
protein) 


1039
2765


100
100 


1757 

1758 
1759 


AB040672 

AL022238 
AF117653 


Homo sapiens 

Homo sapiens 
Homo sapiens 


acetylgalactosaminyltransfera 

dJ1042K10.4 (novel protein)
double homeobox protein 


2020 
776 


99 
43 


1760
1761 


""YlS&e'S 
AL049712 


Homo sapiens 


CIJ68 6C3.2 (nucleolar protein 
nwopbb ; 


375 

2959 

2595 


54 
99 
99 


1762 


AC002394 


Homo
sapiens 


Gene product with similarity 
to dynein beta subunit 


1542 


51 


1763 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase 


877 


100 


1764 


U91541 


Homo sapiens 


human formiminotransferase
cyclodeaminase (ftcd) protein,
carboxy-terminal end


596 


100 


1765 
1766


AB013365 


Bacillus
halodurans


YlgF 


350 


34 




Y38421 


Homo sapiens 


Human secreted protein 
encoded by gene No. 36. 


145 


71 


1767 

1768 
1769 
1770 

1 1 71 


AC009176 

AK000647
AJ238982 
U73522 


Arabidopsis
thaliana 

Homo sapiens 
Homo sapiens 
Homo sapiens 


putative ribulose-1,5-
bisphosphate

carboxylase/oxygenase small

subunit N-methyltransferase I

unnamed protein product

VNN3 protein 

AMSH 


216

737 

2665 

1214 


27

99 
99 
56 


X / FX 

1772 
1773 
1774 

1775 


U89435 
S70011 
AL035086 
Y99426 

AF110330


Mus musculus
Rattus sp. 
Homo sapiens 
Homo sapiens 

Homo sapiens 


unknown

tricarboxylate carrier 
dJ44A20.2 (novel protein) 

Human PR01604 (UNQ785) amino 

acid sequence SEQ ID NO: 300. 
glutaminase


829 
1604 
2036 
1057

3146


86 
95 
100 
99 

100 


1776 
1777 

1778 


AJ269529 
Z81579 

AY007239 


Homo sapiens 
Caenorhabdit 
is elegans 
Homo sapiens 


g*j\.uiW4 ^ ^ii»j£»^»iicitc permease 

CDNA EST yJt76£l.S comes from 

this gene 

monooxygenase X


'nnah 

2787 
232 


100 
31 


1779 
1780 


AL109608 
AF254260 


Schizosaccha 

romyces 

pombe 

Homo sapiens 


oxysterol- binding protein 
family 

tuftelin 1


1875 

644 


99 
38 


1781


1*07924 


Mus musculus 


guanine nucleotide 
dissociation stimulator 


1729 
247


100 
50 


1782 
1783 


AF295773 
AK024475 


Homo 
sapiens 
Homo sapiens 


ral guanine nucleotide 
dissociation stimulator 
FLJ00068 protein 


142 
4333 


49
100 


1784 
1785 

1786


AK024475 
G03933

S82637


Homo sapiens
Homo sapiens

Homo sapiens


FLJ00068 protein

Human secreted protein, SEQ

ID NO: 8014. 

Ig lambda-like gene/beta- 


3996 
570 

247 


93 
100 

100
TABLE 2



SEQ 


ACCESSION 


SPECIES 


DESCRIPTION 


SMITH- 




ID 


NUMBER 






WATERMAN 


IDENTITY 


NO: 








SCORE 










glucuronidase exon 11 homolog 







TABLE 3



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


2 


BL00240


Receptor tyrosine kinase
class III proteins.


BL00240B 24.70 8.250e-
12 157-181


3 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 8.085e-
13 358-381 


4 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 9.400e-
10 1129-1146 BL00028
16.07 1.257e-09 820-
837


5 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023
24.31 4.545e-27 353-
390 


6 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins.


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


7 


BL00023 


Type II fibronectin
collagen-binding domain
proteins.


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


8 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353-
390 


9 


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 5.119e-
09 863-917 


10


PR00464


E-CLASS P450 GROUP II
SIGNATURE


PR00464D 17.40 6.182e-
12 294-312 PR00464G
12.41 4.231e-11 377-
393 


11


PR00734 


GLYCOSYL HYDROLASE 
FAMILY 7 SIGNATURE 


PR00734I 11.46 4.296e- 
09 502-520 


12 


PF00023 


Ank repeat proteins. 


PF00023B 14.20 6.500e- 
10 89-99 PF00023B 
14.20 2.636e-09 56-66 


14 


DM00031


IMMUNOGLOBULIN V REGION.


DM00031B 15.41 3.848e-
09 79-113 


15 


PR00208


GLIADIN AND LMW GLUTENIN 
SUPERFAMILY SIGNATURE 


PR00208A 12.59 9.8^8e- 
10 517-535 PR00208A
12.59 2.233e-09 520- 
538 


17 


PD00066 


PROTEIN ZINC-FINGER 
METAL- BIND I . 


PD00066 13.92 8.200e- 
14 232-295 PD00066 
13.92 9.400e-14 477- 
490 PD00066 13.92 
6.500e-13 505-518 
PD00066 13.92 9.500e- 
13 254-267 PD00066 
13.92 1.429e-12 393- 
406 PD00066 13.92 
6.571e-12 421-434 


18 


BL00845


CAP-Gly domain proteins. 


BL00845 16.43 2.200e- 
25 55-80 


20 




IMP dehydrogenase / GMP 
reductase proteins. 


BL00487B 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 287-329


21 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235-
276 BL00487G 26.82 
4.082e-12 348-390 


22 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e-
26 302-333
SEQ ID NO:


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


23 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 3.250e-
26 302-333 


2£ 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins . 


BL00115T 8.45 7.273e- 
29 1208-1242 BL00115Q 
18.08 2.776e-21 953- 
983 BL00115Y 11.86 
8.000e-17 1604-1650
BL00115M 19.19 8.130e-
16 731-774 BL00115H
14.34 9.392e-16 463-
496 BL00115A 15.44
7.414e-15 43-82
BL00115R 6.50 6.128e-
14 983-1010 BL00115J
16.71 9.289e-14 591-
617 BL00115I 8.33
4.336e-13 535-590
BL00115L 12.25 5.939e-
13 662-694 BL00115S
11.65 6.011e-13 435-
463 BL00115K 15.03
3.417e-10 617-659
BL00115O 16.76 5.805e-
10 863-913 BL00115P
11.54 7.538e-10 913-
953 BL00115S 18.24
7.968e-10 1010-1052
BL00115U 10.34 4.475e-
09 1242-1265


26 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420A 20.42 4.109e- 
11 81-110 BL00420A 
20.42 8.820e-10 84-113


27 


BL00050 


Ribosomal protein L23 
proteins. 


BL00050A 23.71 9.250e- 
27 94-127 BL00050B
14.81 8.125e-12 133-
147 


28 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925B 3.73 3.089e- 
10 41-54 


29 


PF00756


Putative esterase.


PF00756C 14.12 1.100e-
09 486-516 


32 


BL00557 


FMN-dependent alpha-
hydroxy acid
dehydrogenases proteins.


BL00557D 17.76 5.0^5e-
37 274-316 BL00557A
35.08 8.909e-29 24-73
BL00557C 15.59 1.000e-
28 227-257 BL00557B
21.27 8.898e-22 130- 
169 


34 




SHC PHOSPHOTYROSINE 
INTERACTION DOMAIN 
SIGNATURE 


PR00629E 9.90 5.886e-
35 299-328 PR00629F
10.95 8.364e-32 334-
361 PR00629B 13.56
3.786e-27 224-247
PR00629A 13.45 8.364e-
21 206-222 PR00629C
3.80 4.000e-12 249-261
PR00629D 12.45 3.739e- 
11 276-286 


35 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-
40 39-79 PD01270B 
22.18 2.875e-38 94-131 
PD01270D 24.66 3.700e- 
34 171-207 PD01270C 
19.54 3.455e-30 137- 
166 


36


PD01270


RECEPTOR FC
IMMUNOGLOBULIN AFFIN.


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-
34 171-207 PD01270C
19.54 3.455e-30 137-
166


37 


BL00412


Neuromodulin (GAP-43) 
proteins . 


BL00412C 16.28 9.241e-
10 264-298 


38


BL00412


Neuromodulin (GAP-43)
proteins. 


BL00412C 10.28 9.241e- 
10 254-298 


39 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e-
10 264-298 


40 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380B 12.64 7.356e-
14 342-360 PR00380C
13.18 6.927e-13 375-
394 PR00380D 9.93
2.180e-12 429-451
PR00380A 14.18 5.154e- 
12 143-165 


44 


BL00345 


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 239-290 BL00345A
13.96 2.452e-14 204-
223 


45 


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 215-266 BL00345A
13.96 2.452e-14 180- 
199 


46 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551A 15.63 3.538e-
26 172-202 DM01551C
14.62 3.571e-17 232-
252 DM01551B 8.84
4.750e-11 214-226


47 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 9.328e- 
11 246-260 


48 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.231e-
33 6-45


50 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e-
19 994-1019 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1020-1042 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


51 


BL00972


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e-
19 990-1015 BL00972A
11.93 7.120e-18 216-
234 BL00972E 20.72
9.471e-14 1016-1038
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


52 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 3.063e-
14 10-54 


53 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 8.500e-
17 20-38 PR00988F
12.23 7.828e-15 196-
210 PR00988C 13.64 
6.108e-14 104-120 
PR00988B 8.27 3.872e- 
11 174-136 PRO0988D 
5.95 6.878e-10 160-171 
PR00988B 11.60 2.915e- 
09 57-69 


55 


PR00762


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.682e-
21 294-314 PR00762D
11.29 4.103e-19 509-
530 PR00762A 14.22
9.333e-18 199-217
PR00762F 15.12 3.100e-
16 563-563 PR00762B
12.12 6.063e-16 230-
250 PR00762E 12.07
2.286e-15 545-562
PR00762G 14.13 6.276e-
13 601-616




BL00216 


Sugar transport 
proteins . 


BL00216B 27.64 8.800e- 
10 153-203 


58 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 2.049e- 
10 1080-1135 


59 


PF00791


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 2.049e- 
10 1062-1117 


61 


PD01929 


KINASE TYPE RESISTANCE 
ANTIBIOTIC TRANSFERASE 
AM. 


PD01929E 10.76 9.018e- 
09 206-221 


68 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e-
09 680-693 


69 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e-
09 670-683 


70 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 8.714e-
10 51-64


72 


DM00179


v KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.304e- 
09 108-118 


73 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239B 25.15 7.075e- 
12 118-166


74


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 6.116e- 
10 93-120 


76 


DM00471 


0 PROKARYOTIC DNA 
TOPOISOMERASE I. 


DM00471A 11.73 9.357e- 
13 53-66 DM00471B 
8.45 4.857e-12 70-81 


80 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e- 
13 223-236 PD02876D 
12.13 2.588e-12 334-
351


81 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE . 


PD02876C 8.80 2.723e-
13 282-295 PD02876D
12.13 2.588e-12 393-
410 


83 


BL00708


Prolyl endopeptidase
family serine proteins.


BL00708B 24.91 7.197e-
12 570-601 


84


PR00014 


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014C 15.44 8.043e- 
09 985-1004 


86 


PR00678 


PI3 KINASE P85
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 1.379e- 
09 246-269 


89 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 8.200e-
09 264-279 PR00320B
12.19 8.650e-09 264-
279 


93 


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 2.588e- 
14 315-332


9* 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e-
10 123-154 


96 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 212-243 


97 


PR00081


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY
SIGNATURE


PR00081B 10.38 6.318e-
13 134-146 PR00081A
10.53 2.500e-12 54-72


98 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 5.500e- 
24 401-423 PR00380D 
9.93 7.188e-20 613-635 
PR00380B 12.64 7.517e- 
16 529-547 PR00380C 
13.18 2.756e-13 560- 
579


102 


PR00300


ATP-DEPENDENT CLP
PROTEASE ATP-BINDING
SUBUNIT SIGNATURE


PR00300A 9.56 7.545e- 
14 289-308 


104 


BL00479 


Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479B 12.57 6.786e- 
±a 298-314 BL00479A 
19.86 4.913e-16 155- 
178 BL00479A 19.86
4.300e-13 272-295
BL00479B 12.57 6.294e-
12 181-197


106


BL01019


ADP-ribosylation factors
family proteins.


BL01019A 13.20 8.013e- 
12 43-83 


107 


DM01970 


0 JCV ZK632.12 YDR313C 


DM01970B S.OOOe- " 
16 403-416 


108 


BL00191 


Cytochrome b5 family, 
heme-binding domain 
proteins . 


BL00191K 17.38 4.951e-
27 238-282 BL00191J 
11.37 6.447e-17 182- 
204 


109 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.938e- 
37 8-47 


110 


BL01138 


Scorpion short toxins
proteins . 


BL01138A 10.96 8.297e- 
10 38-50 


113 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 5.800e- 
23 156-187 BL00107B 
13.31 9.100e-14 225- 
241 


117




Cytosolic fatty-acid 
binding proteins. 


BL00214B 26.51 1.000e-
17 46-91 BL00214A
21.17 7.052e-11 5-31


118


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 8. 56*09- 
13 36-67 


119 


PR00529


GONADOTROPIN RELEASING 
HORMONE RECEPTOR 
SIGNATURE 


PR00529C 11.03 7.506e- 
10 158-177 


120 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


121 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 9.400e-
09 80-95 


127 


BL00215 


Mitochondrial energy 
transfer proteins . 


BL00215A 15.82 7.158e- 
13 216-241 


128 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032C 6.14 3.195e- 
12 147-157 BL01032H 
11.25 5.680e-11 318-
331 BL01032G 8.33 
8.9^2e-ll 282-296 
BL01032I 10.42 8.902e-
09 379-389


129 


BL01310 


ATP1G1 / PLM / MAT8
family proteins.


BL01310 14.74 6.694e- 
26 28-64 


130 


PR00990 


RIBOKINASE SIGNATURE 


PR00990B 12.32 9.534e-
16 47-67 PR00990A
16.23 5.500e-14 20-42
rKuuyyuL l-c.o/ 2.412e- 
09 119-133 


133


BL00880 


Acyl-CoA-binding
protein. 


BL00880 17.52 5.575e- 
26 72-122 


134 


BL00030


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 9.308e- 
14 18-37 


135 


PR00215 


NEUROMODULIN SIGNATURE 


PR00215C 13.98 6.779e- 
10 475-496 


136 


BL01310 


ATP1G1 / PLM / MAT8
family proteins.


BL01310 14.74 2.432e-
29 71-107 


140 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.982e- 
14 214-231 BL00028 
16.07 9.471e-14 102- 
119 BL00028 16.07 
2.800e-13 18-35 



WO 01/53312 PCT/US00/34263



BL00028 16.07 5.500e-
13 74-91 BL00028 
16.07 9.100e-13 186- 
203 BL00028 16.07 
8.043e-12 46-63 
BL00028 16.07 8.435e- 
12 130-147 BL00028 
16.07 9.217e-12 270- 
287 BL00028 16.07 
6.192e-11 242-259
BL00028 16.07 4.000e- 
10 156-175 


141 


BL00501 


Signal peptidases I 
serine proteins. 


BL00501D 16.69 9.538e- 
14 113-133 BL00501C
9.61 8.688e-10 89-101 


143 


BL01020


SAR1 family proteins.


BL01020C 15.35 7.722e- 
20 79-130 


146


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.400e- 
25 335-374 


149 


BL00126 


3'5'-cyclic nucleotide
phosphodiesterases 
proteins . 


BL00126C 22.07 1.450e- 
25 509-550 BL00126E 
35.22 3.951e-16 654-
709 BL00126D 25.50
1.360e-15 565-604
BL00126B 15.20 8.200e-
11 483-495 BL00126A
27.56 8.269e-11 442-
479 


151 


BL00632 


Ribosomal protein S4
proteins.


BL00632 23.79 5.271e-
20 106-149 


154 


BL00559 


Eukaryotic molybdopterin
oxidoreductases
proteins.


BL00559I 13.63 5.304e- 
19 29-58 BL00559K 
13.17 2.957e-18 172- 
199 BL00559J 19.63
8.385e-13 99-151
BL00559L 13.60 5.814e-
12 241-259 


155 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.692e- 
13 13-35 


157 


BL00406 


Actins proteins.


BL00406D 12.58 2.547e-
18 275-330 BL00406A
9.95 5.776e-16 15-50 
BL00406B 5.47 7.429e- 
12 69-124 BL00406C 
6.75 9.682e-12 128-183 


160 


BL00132 


Zinc carboxypeptidases,
zinc-binding region 1
proteins. 


BL00132A 26.07 7.000e- 
14 22-63 BL00132C 
21.35 3.466e-12 104- 
145 


165 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 9.043e- 
13 139-156 


168 


BL00362 


Ribosomal protein S15
proteins. 


BL00362 24.67 9.700e- 
15 129-172 


169 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.000e-
35 640-686 BL00039A
18.44 1.964e-13 212-
251 BL00039B 19.19 
4.553e-13 378-404 
BL00039C 15.63 8.773e- 
12 465-489 


175 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.721e- 
12 14-36 


178 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e- 
29 133-169 


179 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.455e-
36 6-45




180 


PR00007


COMPLEMENT C1Q DOMAIN
SIGNATURE


PR00007B 14.16 7.429e-
20 160-180 PR00007A
19.33 4.938e-19 133-
160 PR00007C 15.60
1.225e-15 206-228
rauuuw iu ? . b4 o . oobs- 
11 238-249 


181


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 280-323


182 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 263-306 


183 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 280-323 


184 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 263-306 


188 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 460-471


189 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 440-451 


190 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins. 


BL00383F 15.51 7.188e- 
17 666-682 BL00383A 
13.34 8.714e-17 162- 
177 BL00383E 10.35
1.000e-14 333-344
BL00383E 10.35 7.300e-
14 628-639 BL00383F
15.51 1.720e-13 371-
387 BL00383C 10.10
3.000e-13 217-228
BL00383D 11.92 7.000e-
13 295-308 BL00383B
7.61 1.692e-11 187-196
BL00383C 10.10 1.750e-
09 509-520 BL00383D
11.92 4.000e-09 589-
602 BL00383B 7.61
8.000e-09 479-488


191 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 7.911e- 
15 83-105 PR00450C 
12.22 6.286e-13 47-69 


193 


PF00564


Octicosapeptide repeat 
proteins. 


PF00564B 24.74 6.164e- 
16 227-278 


194 




BROMODOMAIN SIGNATURE


PR00503D 20.81 9.156e- 
15 204-224 PR00503B 
9.96 9.571e-13 170-187 


195 


BL00901


Cysteine
synthase/cystathionine
beta-synthase P-
phosphate att.


BL00901C 20.63 3.429e- 
18 67-117 


197


BL00636


Nt-dnaJ domain proteins. 


BL00636A 8.07 6.211e- 
1/ oJjU0o36B 
15.11 2.000e-13 67-88 


198 


PR00690 


ADHESIN FAMILY SIGNATURE


PR00690A 10.86 9.8£6e- 
09 463-482 


199 


BL01131 


Ribosomal RNA adenine 
dimethylases proteins. 


BL01131A 26.62 2.343e- 
12 84-130 


201 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 8.352e- 
12 509-522 


203 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 2.286e- 
10 39-72 


206 


PR00261 


LOW DENSITY LIPOPROTEIN 
(LDL) RECEPTOR SIGNATURE 


PR00261A 11.02 4.462e- 
19 65-87 PR00261C 
11.37 9.308e-19 65-87 
PR00261D 12.47 2.667e- 
18 65-87 PR00261B 
14.12 4.000e-18 143- 
165 PR00261A 11.02
4.833e-18 143-165
PR00261D 12.47 7.500e- 
18 143-165 PR00261B 
14.12 5.065e-16 65-87 
PR00261C 11.37 8.967e-
16 143-165 PR00261F
11.57 4.938e-13 143-
165 PR00261E 11.08
7.180e-13 65-87
PR00261F 11.57 7.188e-
13 65-87 PR00261E
11.08 1.643e-11 143-
165 


209 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors.


PF00791B 28.49 6.143e-
13 118-173 PF00791C 
20.98 7.680e-10 132- 
171 


211 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007A 19.33 5.791e- 
19 131-158 PR00007B
14.16 4.115e-18 158- 
178 PR00007C 15.60 
1.675e-15 201-223 
PR00007D 9.64 7.231e- 
11 233-244 


212


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.545e-
30 43-91


213


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.545e-
30 43-91


215 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins. 


BL00039D 21.67 1.900e- 
29 568-614 BL00039A 
18.44 1.871e-23 21-60 
BL00039C 15.63 1.720e- 
11 364-388 BL00039B 
19.19 4.064e-11 277-
303 


217 


BL00100 


Chloramphenicol 
acetyltransferase
proteins.


BL00100D 17.22 8.484e-
09 68-106 


219 


PR00213 


MYELIN P0 PROTEIN
SIGNATURE


PR00213C 15.94 3.969e-
11 199-227 


222


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 1.947e-09
144-155


224


PR00875


MOLLUSC METALLOTHIONEIN
SIGNATURE


PR00875A 5.83 1.000e-
09 901-913


225 


BL00636 


Nt-dnaJ domain proteins. 


BL00636B 15.11 8.200e- 
19 18-39 


226 


BL00636


Nt-dnaJ domain proteins.


BL00636A 8.07 1.000e-
21 21-38 BL00636B
15.11 8.200e-19 45-66


229


PR00301


70 KD HEAT SHOCK PROTEIN
SIGNATURE


PR00301F 13.98 7.563e-
13 329-346 PR00301G
13.78 4.300e-12 361-
382


230


BL00460


Glutathione peroxidases
selenocysteine proteins.


BL00460A 28.67 8.773e-
20 35-70 BL00460B
9.73 7.429e-16 78-96
BL00460C 14.35 2.831e-
12 111-134 BL00460D
16.89 8.773e-11 140-
160


231 


PR00647


SENR ORPHAN RECEPTOR 
SIGNATURE 


PR00647B 10.19 8.522e- 
09 273-287 


233 


BL00292 


Cyclins proteins.


BL00292B 20.31 7.429e-
27 244-275 BL00292A 
22.87 7.750e-27 201- 
235 


234 


PR00449 


TRANSFORMING PROTEIN P21
RAS SIGNATURE 


PR00449A 13.20 6.308e- 
13 7-29 PR00449C
17.27 4.4£2e-ll 4^-70 
PR00449D 10.79 7.120e- 
11 109-123 


235


PR00019


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 7.300e- 
10 251-265 PR00019B 
11.36 5.320e-09 119- 
133 PR00019B 11.36 
1.000e-08 229-243 


236 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 7.300e-
10 245-259 PR00019B
11.36 5.320e-09 113-
127 PR00019B 11.36
1.000e-08 223-237






PROTEIN SH3 DOMAIN
REPEAT PRESYNA.


PD00289 9.97 8.448e-09
67-81 


240 


PR00011


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 3.492e- 
10 616-635


241 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 3.492e- 
10 616-635 


244 


BL00903


Cytidine and
deoxycytidylate
deaminases zinc-binding
region s.


BL00903 12.93 8.941e-
12 54-64 


245 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 8.043e-
09 124-134 


248 


BL00246 


Wnt-1 family proteins.


BL00246D 23.97 1.000e-
40 186-239 BL00246E 
20.32 1.000e-40 305- 
351 BL00246B 13.69 
4.176e-36 105-140 
BL00246A 15.75 2.286e-
24 70-90 BL00246C 
15.56 4.857e-22 150- 
175 


250 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE


PR00927E 14.93 5.114e- 
10 253-275 


254


BL00674


AAA-protein family
proteins.


BL00674B 4.46 1.000e-
09 223-245 


255 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 6.045e- 
09 61-88 


256 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002B 15.18 2.800e-
10 421-435


258


PR00094 


ADENYLATE KINASE 
SIGNATURE 


PR00094C 12.94 2.200e- 
18 87-104 PR00094D 
12.52 2.731e-14 161- 
177 PR00094A 10.31
5.500e-14 11-25
PR00094B 11.01 4.115e-
13 39-54 PR00094E
11.25 7.333e-13 178-
193 


259


BL00892 


HIT family proteins. 


BL00892A 18.17 5.500e-
13 60-91 


262


BL00388


Proteasome A-type
subunits proteins.


BL00388A 23.14 1.000e-
40 3-54 BL00388B
31.38 3.864e-33 66-108
BL00388D 20.71 1.000e-
21 153-184 BL00388C

J.O./3 o.J.4/e-J.O lib- 

148 


264


BL00903


Cytidine and
deoxycytidylate
deaminases zinc-binding
region s.


BL00903 12.93 5.821e-
09 91-101 


267 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e- 
09 241-257 


270 


BL00226 


Intermediate filaments 
proteins. 


BL00226D 19.10 1.000e-
37 362-409 BL00226B
23.86 8.043e-35 196- 
244 BL00226C 13.23 
7.000e-20 261-292 
BL00226A 12.77 6.143e- 
15 96-111 


271 


PD02952


KINASE TRANSFERASE
CHOLINE PROTEIN
MULTIGENE FAMI.


PD02952C 15.76 9.731e-
16 235-265 PD02952B
15.57 5.625e-09 215-
229 


272 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 1.000e-
40 106-160 PD02929B
18.36 8.800e-17 179-
199 


274 


BL01027


Glycosyl hydrolases 
family 39 proteins. 


BL01027B 15.34 3.486e- 
09 213-250 


275 


PR00424 


ADENOSINE RECEPTOR 
SIGNATURE 


PR00424D 14.32 6.451e-
11 39-59 


277 


BL00052 


Ribosomal protein S7 
proteins. 


BL00052A 27.85 6.000e-
13 137-184 BL00052B
15.17 5.143e-12 208-
235


279 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 5.659e- 
13 267-294 


280


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 107-125 PR00319C
13.41 1.000e-21 89-105
PR00319A 15.27 8.364e-
21 51-68 PR00319B
11.47 8.200e-19 70-85


281 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 94-112 PR00319C
13.41 1.000e-21 76-92
PR00319A 15.27 8.364e-
21 38-55 PR00319B
11.47 8.200e-19 57-72


287 


PF00929


Exonuclease. 


PF00929D 16.17 7.366e- 
09 149-163 


291 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e-
09 93-127 


292 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


294 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.714e- 
12 203-216 


295 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 5.500e-
15 322-339 BL00028
16.07 9.471e-14 433-
450 BL00028 16.07
4.600e-13 648-665
BL00028 16.07 5.500e-
13 760-777 BL00028
16.07 9.550e-13 788-
805 BL00028 16.07
3.348e-12 704-721
BL00028 16.07 6.478e-
12 461-478 BL00028
16.07 8.435e-12 844-
861 BL00028 16.07
1.692e-11 593-610
BL00028 16.07 2.038e-
11 211-228 BL00028
16.07 5.154e-11 732-
749 BL00028 16.07
5.846e-11 377-394
BL00028 16.07 6.885e-
11 816-833 BL00028
16.07 7.231e-11 676-
693 BL00028 16.07
9.654e-11 564-581
BL00028 16.07 4.086e- 
09 517-534 BL00028 
16.07 7.429e-09 489- 
506


296 


BL00215


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 8.333e- 
16 111-136 BL00215A 
15.82 2.723e-11 10-35
BL00215B 10.44 9.526e-
11 152-165 BL00215B
10.44 7.375e-10 59-72
BL00215A 15.82 9.824e- 
10 205-230 


302 


PF00953


Glycosyl transferase.


PF00953C 19.70 8.773e-
34 236-269 PF00953A
19.68 5.000e-25 102-
129 PF00953B 6.17
1.000e-13 182-194


i n a 


PF00152


tRNA synthetases class
II.


PF00152D 21.30 8.364e-
28 422-461 PF00152C
28.03 9.250e-21 220-
257 PF00152B 15.67
2.658e-13 159-184
PF00152A 19.68 5.714e-
11 44-67 


305 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 8.250e- 
35 37-76 


306


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN . 


PD02784B 26.46 5.840e- 
09 92-135 


307 


PR00454


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.809e- 
09 1167-1186


308


PR00237


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237E 13.03 5.091e-
13 188-212 PR00237G
19.63 7.207e-13 268-
295 PR00237A 11.48
4.375e-11 24-49
PR00237C 15.69 3.057e- 
10 101-124 PR00237D 
8.94 4.750e-10 137-159 
PR00237F 13.57 5.364e- 
10 230-255 PR00237B 
13.50 9.438e-10 57-79 


309 


BL00522 


DNA polymerase family X
proteins.


BL00522C 11.90 7.577e-
24 315-339 BL00522F
14.90 1.310e-15 470-
494 BL00522A 25.52
1.265e-14 179-226
BL00522E 19.63 8.615e-
14 430-460 BL00522B
27.30 9.625e-12 267- 
313 


310 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 5.235e- 
10 856-897


312


BL00290 


Immunoglobulins and
major histocompatibility
complex proteins.


BL00290A 20.89 4.706e-
14 151-174 BL00290B 
13.17 9.000e-12 211- 


313 


BL00345 


Ets-domain proteins. 


BL00345B 21.28 1.000e-
40 34-85 BL00345A 
13.96 9.217e-16 1-20 


315


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 5.091e-
15 63-76 


317


BL01020


SAR1 family proteins.


BL01020C 15.35 3.198e- 
17 79-130 


318 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.696e- 
11 164-214 


320 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.814e-
10 216-235




321 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 5.688e-
10 329-372 


322 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 8.765e- 
12 558-577 


324 


BL01241 


Link domain proteins. 


BL01241 35.81 8.313e-
30 183-236 BL01241
35.81 3.222e-13 282-
335 


326 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 4.000e-
12 515-566 BL00412D
16.54 5.705e-11 516-
567 BL00412D 16.54
7.848e-10 518-559
BL00412D 16.54 1.827e-
09 514-565 BL00412D 
16.54 1.918e-09 513- 
554 BL00412D 16.54 
2.102e-09 520-571 


328 


BL00232


Cadherins extracellular
repeat proteins domain
proteins.


BL00232B 32.79 9.557e-
20 151-199 BL00232B
32.79 2.246e-18 41-89
BL00232B 32.79 5.985e-
18 370-418 BL00232B
32.79 5.500e-16 258-
306 BL00232B 32.79
9.384e-15 475-523
BL00232C 10.65 2.537e-
12 256-274 BL00232C
10.65 4.326e-11 368-
386 BL00232C 10.65
7.261e-11 473-491
BL00232C 10.65 7.457e- 
11 39-57 


330 


PR00454


ETS DOMAIN SIGNATURE


PR00454C 11.24 7.808e- 
09 1167-1186 


331 


BL00598 


Chromo domain proteins. 


BL00598 14.45 5.393e-
18 27-49 


333 


BL01016 


Glycoprotease family 
proteins. 


BL01016C 22.84 3.925e- 
32 70-115 BL01016E 
14.88 5.286e-19 149- 
177 BL01016H 13.71 
7.577e-13 291-301 
BL01016D 8.86 3.298e- 
11 127-140 BL01016G 
7.14 5.622e-10 261-271 
BL01016A 5.65 7.167e- 
10 4-19 BL01016F
13.34 1.563e-09 200-
212 BL01016B 8.93
8.855e-09 38-50


339 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.500e- 
11 17-61 


340 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 1.231e-
33 10-49 


341 


BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.042e- 
09 55-109 


342 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.400e- 
30 16-55 


343 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 1.000e-
40 20-68 


346 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.764e- 
11 135-154 


347 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.764e-
11 135-154


351 


BL01187


Calcium-binding EGF-like
domain proteins pattern
proteins.


BL01187B 12.04 1.783e-
13 100-116 BL01187B
12.04 8.435e-13 276-
292 BL01187B 12.04
8.800e-11 13-29
BL01187B 12.04 7.429e-
10 54-70 BL01187B
12.04 5.725e-09 231- 
247 BL01187A 9.98 
7.000e-09 255-267 


352 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e- 
10 366-379 PD00078B
13.14 4.522e-09 168- 
181 


354 


BL00380


Rhodanese proteins. 


BL00380F 9.7£ 6.694e- 
11 542-553 


355


PF00628


PHD-finger.


PF00628 15.84 1.000e-
11 116-131 


356 


PR00587 


SOMATOSTATIN RECEPTOR 
TYPE 1 SIGNATURE 


PR00587A 8.06 9.700e- 
09 17-37 


359 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 4.462e-
15 261-274 PD00066
13.92 6.500e-13 233-
246 PD00066 13.92
4.300e-09 289-302 


361 


PF00791 


Domain present in ZO-1
and Unc5-like netrin 
receptors . 


PF00791B 28.49 9.604e- 
13 54-109 PF00791B 
28.49 1.095e-12 21-76 
PF00791A 27.85 1.432e- 
09 71-126 PF00791B 
28.49 7.440e-09 184- 
239 


362 


PF00791 


Domain present in ZO-1
and Unc5-like netrin 
receptors. 


PF00791B 28.49 2.273e- 
11 279-334 


363 


PR00450 


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 5.080e-
10 73-95 PR00450C 
12.22 3.278e-09 109- 
131 


364 


PF00242 


DNA polymerase (viral) 
N-terminal domain 
proteins. 


PF00242O 13.51 2.328e- 
09 22-68 


365 


PF00242


DNA polymerase (viral) 
N-terminal domain 
proteins. 


PF00242Q 13.51 2.328e- 
09 22-68 


366 


BL01160 


Kinesin light chain
repeat proteins. 


BL01160B 19.54 6.644e- 
09 1038-1092 


367


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.360e-
09 229-243 PR00019B
11.36 6.040e-09 91-105
PR00019A 11.19 8.667e-
09 370-384


368


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 9.000e-
15 30-49 PR00011A
14.06 9.830e-15 30-49
PR00011B 13.08 4.500e-
14 30-49 PR00011C
24.25 5.143e-09 6-35


369 


BL01032


Protein phosphatase 2C 
proteins . 


BL01032H 11.25 4.150e- 
09 417-430 


372 


BL00478 


LIM domain proteins. 


BL00478B 14.79 7.750e- 
12 410-425 


373


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.757e-
34 26-65


376


PR00170


SODIUM CHANNEL SIGNATURE


PR00170E 6.48 2.739e-
10 88-118


380 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 1.000e-
23 276-307 BL00107B 
13.31 1.692e-12 342- 
358 


•>0Jk 


BL00455


Putative AMP-binding
domain proteins.


BL00455 13.31 5.714e-

12 so-66 


TOO 


PR00624


HISTONE H5 SIGNATURE


PR00624G 4.08 4.900e- 
09 524-544 


1(11 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e- 
10 366-379 PD00078B
13.14 4.522e-09 168- 
181 


385 


PR00511


TEKTIN SIGNATURE 


PR00511D 7.11 5.371e- 
09 67-80 


386 


PD02870


RECEPTOR INTERLEUKIN-1
PRECURSOR . 


PD02870B 18.83 ^.D00e- "* 
10 97-130 


388 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 5.000e- 
13 516-529 


389 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 7.657e-
09 151-174 


390 


BL00215 


Mitochondrial energy 
transfer proteins . 


BL00215A 15.82 5.200e-
15 221-246 BL00215A
15.82 7.618e-14 20-45
BL00215A 15.82 8.851e-
11 123-148 BL00215B
10.44 9.526e-11 69-82
BL00215B 10.44 7.300e-
09 272-285 BL00215B 
10.44 8.500e-09 165- 
178 


394 


BL00674 


AAA-protein family 
proteins. 


BL00674B 4.46 2.723e-
16 299-321 


397 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.579e-
11 141-155 


398 


PR00761 


BINDIN PRECURSOR
SIGNATURE 


PR00761B 9.93 6.764e- 
09 55-74 


399 


BL00240


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 7.907e- 
10 118-142 


401 


PF00676 


Dehydrogenase E1
component.


PF00S76B 7.4.71 8 . 071e- 
18 331-369 PF00676D
14.40 3.854e-15 486- 
506 PF00676C 16.88 
9.182e-14 454-478 


402 


BL00514 


Fibrinogen beta and 
gamma chains C-terminal
domain proteins. 


BL00514C 17.41 4.673e- 
28 4432-4469 BL00514G 
15.98 6.092e-14 4555-
4585 BL00514D 15.35 
2.532e-12 4473-4486 
BL00514F 11.65 4.288e- 
10 4519-4534 BL00514H
14.95 4.955e-10 4584- 
4609 


403 


PF00992 


Troponin.


PF00992A 16.61 5.474e-
09 105-140 


404 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.450e-
10 73-87 PR00019A
11.19 8.043e-10 76-90
PR00019B 11.36 1.000e-
09 50-64 PR00019B
11.36 1.000e-09 96-110 


405 


BL00232


Cadherins extracellular 
repeat proteins domain 
proteins . 


BL00232B 32.79 9.557e-
20 139-187 BL00232B 
32.79 2.246e-18 29-77 
BL00232B 32.79 5.985e- 
18 358-406 BL00232B 
32.79 5.500e-16 246-
294 BL00232B 32.79
9.384e-15 463-511
BL00232C 10.65 2.537e-
12 244-262 BL00232C
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-479
BL00232C 10.65 7.457e-
11 27-45


407 


PF00426 


Outer Capsid protein VP4 
(Hemagglutinin) . 


PP00426S 15.67 5.634e- 
09 902-940 


409 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.695e- 
09 126-180 


410 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 2.731e- 
09 252-275 


411 


PF00646


F-box domain proteins.


PF00646A 14.37 6.344e-
09 86-100 


412 


BL00603


Thymidine kinase 
cellular-type proteins. 


BL00603B 11.39 8.500e- 
09 542-557 


415 


BL00866


Carbamoyl-phosphate
synthase subdomain 
proteins. 


BL00866B 36.29 3.571e- 
31 245-291 BL00866C 
23.26 9.000e-25 331- 
366 


418


PR00239


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE 


PR00239E l.*8 *.114e- 
09 590-602 


421 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors.


PF00791B 28.49 7.955e-
14 23-78 PF00791B
28.49 3.653e-12 273-
328 PF00791B 28.49
4.273e-11 156-211
PF00791B 28.49 7.818e- 
11 89-144 PF00791B 
28.49 1.524e-10 56-111 
PF00791C 20.98 3.559e- 
09 37-76 PF00791C 
20.98 5.235e-09 170- 
209 PF00791C 20.98 
5.235e-09 381-420 
PF00791B 28.49 6.202e- 
09 189-244 PF00791B 
28.49 7.028e-09 435-
490 PF00791B 28.49 
8.679e-09 367-422 


424


DM00892


3 RETROVIRAL PROTEINASE.


DM00892C 23.55 7.207e-
28 1545-1679 


425 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 5.881e- 
10 228-251 


429 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.600e- 
11 31-40 


431 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.844e-
34 490-536 BL00039A
18.44 5.615e-19 205-
244 BL00039B 19.19
8.920e-16 251-277
BL00039C 15.63 5.781e- 
15 333-357 


432 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 7.652e- 
12 169-185 


433 


PR00828 


FORMIN SIGNATURE 


PR00828B 5.23 8.218e-
10 382-405 


436 


BL00415


Synapsins proteins. 


BL00415N 4.29 8.643e- 
11 195-239 BL00415N 
4.29 3.036e-09 809-853 


443 


PR00834 


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE 


PR00834F 10.91 6.040e- 
11 221-234 


446 


PF01140 


Matrix protein (MA),
p15.


PF01140D 15.54 9.663e-
10 183-218 PF01140D
15.54 3.093e-09 246-
281


449 


PR00568


DOPAMINE D3 RECEPTOR
SIGNATURE


PR00568G 13.95 5.551e-
09 39-53 


451 


PF00084 


Sushi domain proteins
(SCR repeat proteins).


PF00084B 9.45 3.813e- 
10 47-59 


452 


BL00790 


Receptor tyrosine kinase
class V proteins. 


BL00790I 20.01 2.821e- 
09 618-649 


456 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 1.000e-
25 77-99 PR00380D
9.93 1.000e-21 281-303
PR00380C 13.18 8.286e- 
17 230-249 PR00380B 
12.64 4.724e-16 194- 
212 


457


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE


PR00253A 9.15 9.143e-
24 246-267 PR00253B 
13.47 2.000e-23 272- 
294 PR00253C 13.85 
7.000e-23 306-328 
PR00253D 16.68 5.950e- 
21 452-473 


467 


PR00849


GLYCOSYL HYDROLASE
FAMILY 58 SIGNATURE


PR00849D 9.77 9.236e- 
09 910-937 


471 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 8.200e-12
33-44 


472 


BL00226 


Intermediate filaments 
proteins.


BL00226B 23.86 3.721e- 
09 282-330 


473 


BL00344 


GATA-type zinc finger
domain proteins. 


BL00344 17.99 7.000e- 
12 814-852 


474 


BL00481 


Thiol-activated
cytolysins proteins. 


BL00481E 1^.07 B.909e- 
09 173-199


479


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 2.571e- 
09 393-408 


480 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 1.900e-
38 8-47 


481 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405C 19.41 1.000e-
19 451-473 PR00405B
11.83 4.333e-18 430-
448 PR00405A 17.71
4.971e-18 411-431


482 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 9.286e-
10 959-974 PR00049D
0.00 9.857e-10 958-973
PR00049D 0.00 1.305e- 
09 937-952 PR00049D 
0.00 8.322e-09 939-954 


486 


PR00007


COMPLEMENT C1Q DOMAIN
SIGNATURE


PR00007B 14.16 8.t>15e- 
23 653-673 PR00007A
19.33 6.192e-22 626- 
653 PR00007C 15.60 
o.aabe-19 698-720 
PR00007D 9.64 3.647e- 
J.J /J^-743 


487 


PD00567 


PROTEIN RNA- BINDING RNA 
REPEAT HYD. 


PD00567B 18.23 2.853e-
09 200-214 


488 


PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e- 
12 3-21 


489 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 4.882e- 
27 30-69 PD01066 
19.43 3.430e-10 71-110 


490 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 7.864e-
09 663-678 


492


BL01128


Shikimate kinase
proteins.


BL01128A 18.84 6.464e-
17 58-92


497


PF00429


env polyprotein (coat
polyprotein).


PF00429 31.08 7.171e-
15 21-71


498 


BL00120 


proteins.


BL00120B 11.37 7.923e-
09 185-200 


500


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.353e- 
11 299-318 


501 


BL01159


WW/rsp5/WWP domain
proteins.


BL01159 13.85 8.579e- 
12 131-146 


505 


BL00021 


Kringle domain proteins.


BL00021B 13.33 3.739e-
17 492-510 


508 


PR00120 


(PROTON PUMP) SIGNATURE 


PR00120C 9.90 5.800e- 
19 705-722 


509 


DM01417


o Kw JLJJJUUCIKG JCPMC2 
MUSHROOM SPAC22G7.04. 


DM01417E 20.62 2.938e- 
16 362-395 DM01417D 
11.08 3.800e-13 322- 
338 


510 


PF00534 


Glycosyl transferases 
group 1. 


PF00534B 14.47 6.625e- 
09 346-370 


511 


PF00534 


Glycosyl transferases
group 1.


PF00534B 14.47 6.625e-
09 293-317 


512 


PF00534 


Glycosyl transferases 
group 1.


PF00534B 14.47 6.625e- 
09 366-390 


513 


PD01841 


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-295
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 18.42
1.000e-40 968-1010
PD01841I 23.00 4.545e-
37 762-804 PD01841E
18.60 3.750e-36 295-
333 PD01841J 14.94
6.023e-35 851-888
PD01841H 21.30 2.909e-
33 490-527 PD01841K
14.81 7.088e-33 924-
954 PD01841C 13.78
9.386e-23 222-243
PD01841M 10.82 8.594e-
21 1054-1073 PD01841I
23.00 2.667e-13 549-
591


514 


PR00153 


CYCLOPHILIN PEPTIDYL-
PROLYL CIS-TRANS
ISOMERASE SIGNATURE


PR00153C 11.01 7.188e- 
13 95-111 PR00153E 
9.10 4.150e-12 122-138 


515 


BL00740 


MAM domain proteins. 


BL00740A 13.87 7.188e- 
12 410-423 


516 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.087e- 
12 1018-1052 


517 


BL00242 


Integrins alpha chain 
proteins. 


BL00242C 16.36 8.320e- 
09 12-42 


523 


DM00031 


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 3.750e-
39 20-68 DM00031B 
15.41 1.000e-25 84-118 


525 


BL00319 


Amyloidogenic
glycoprotein 
extracellular domain 
proteins. 


BL00319C 17.12 8.375e-
10 61-95


526 


PF00789


Domain present in
ubiquitin-regulatory
proteins. 


PF00789B 19.70 3.308e- 
12 322-343 PF00789C
20.98 5.269e-09 367- 
392 


528 


BL01162 


Quinone oxidoreductase / 
zeta-crystallin 
proteins.


BL01162C 22.80 1.500e-
16 120-164


529 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 3.893e- 
09 60-73 


532 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 123-
148


533 


BL00215 


Mitochondrial energy
transfer proteins.


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 97-122


534 


BL00098 


Thiolases acyl-enzyme 
intermediate proteins. 


BL00098C 21.65 ^OOOe- 
38 181-227 BL00098B 
32.59 5.345e-38 86-141 
BL00098D 26.30 8.364e-
35 245-298 BL00098E 
22.12 1.000e-34 314- 
352 BL00098F 10.18 
4.971e-22 365-386 
BL00098A 10.60 6.455e- 
11 38-50 


535


PR00370


FLAVIN-CONTAINING
MONOOXYGENASE (FMO)
SIGNATURE 


PR00370E 11.96 7.429e- 
22 321-340 PR00370D 
16.33 6.143e-21 185- 
204 PR00370F 17.75
6.559e-21 376-396
PR00370B 10.91 9.591e- 
21 27-46 PR00370C 
12.72 3.500e-20 140- 
157 PR00370A 3.35 
6.442e-17 4-20 


536 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.429e- 
16 285-302 BL00028 
16.07 6.294e-14 341- 
358 BL00028 16.07 
1.346e-11 369-386
BL00028 16.07 1.692e-
11 397-414 BL00028
16.07 4.452e-11 453-
470 BL00028 16.07
7.231e-11 425-442
BL00028 16.07 4.300e- 
10 313-330 


537 


BL00762 


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e- 
15 844-881 


538


BL00762


WHEP-TRS domain
proteins.


BL00762A 23.43 9.419e- 
15 819-856 


539 


BL00762 


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e- 
15 822-859 


540 


PR00985 


LEUCYL-TRNA SYNTHETASE
SIGNATURE 


PR00985A 12.10 9.000e- 
10 357-375 


541 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL.


PD02102A 16.74 1.000e-
40 3-47 PD02102B 
18.28 4.375e-34 57-100 
PD02102D 21.69 1.923e- 
30 179-218 PD02102C 
26.34 8.929e-26 100- 
146 


543 


BL00028 


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 1.000e-
10 48-65 BL00028 
16.07 6.400e-10 193- 
210 BL00028 16.07 
1.000e-09 343-360 
BL00028 16.07 6.914e- 
09 78-95 


545 


BL00250 


TGF-beta family 
proteins.


BL00250A 21.24 8.000e-
31 293-329 BL00250B
27.37 5.286e-24 354-
390


547


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.714e-
09 186-201 PR00319A
15.27 7.344e-09 210-
227


WO 01/53312 PCT/US00/34263


548


BL01204


NF-kappa-B/Rel/dorsal
domain proteins.


BL01204A 17.74 1.000e-
40 8-56 BL01204D 
16.42 1.000e-40 177- 
221 BL01204E 13.83 
7.652e-33 225-250 
BL01204C 13.93 8.714e- 
22 141-160 BL01204B 
15.41 4.333e-16 102- 
116 


3^* ;» 




GTP1/OBG GTP-BINDING 
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.364e- 
15 255-276 




PF00632 


HECT-domain (ubiquitin-
transferase).


PF00632C 20.66 3.302e- 
23 1569-1601 PF00632B 
18.45 3.700e-21 1515- 
1543 


554 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 1.600e- 
14 187-205 BL00290A 
20.89 2.059e-14 130- 
153 


557 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 £.339e- 
09 846-879 


559 


DM01111 


4 kw PHOSPHATASE 
TRANSFORMING 61K PDF1.


DM01111L 11.93 3.762e- 
09 7-35 


562 


PF00658


Poly-adenylate binding 
protein, unique domain 
proteins. 


PF00658C 16.33 9.455e- 
32 118-155




BL00141


Eukaryotic and viral
aspartyl proteases
proteins.


BL00141A 12.10 4.150e- 
10 472-488 


566 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.667e- 
15 272-289 


567 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.977e-
13 229-268 


569 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183-
199 


570 


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 7.000e-
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199 


572 


PR00193


MYOSIN HEAVY CHAIN
SIGNATURE


PR00193D 14.36 1.857e-
34 454-483 PR00193C 
12.60 2.636e-31 223- 
251 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.583e- 
22 115-135 PR00193E 
19.47 6.559e-19 508- 
537 


573 


PR00193


MYOSIN HEAVY CHAIN
SIGNATURE


PR00193D 14.36 1.857e-
34 470-499 PR00193C 
12.60 2.636e-31 239- 
267 PR00193B 11.69 

PR00193A 15.41 2.588e-
22 115-135 PR00193E 
19.47 6.559e-19 524- 
553 


575 


BL00752


XPA protein. 


BL00752B 19.17 9.703e- 
10 885-929 


576 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 7.000e- 
09 276-295 


577 


BL00116


DNA polymerase family B
proteins.


BL00116A 12.81 5.737e-
13 864-877 BL00116B
11.82 1.529e-12 952-
965


578 


BL00195 


Glutaredoxin proteins.


BL00195B 15.31 7.158e-
09 121-141 


579 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 9.000e-
11 217-231 PR00019B
11.36 1.360e-09 386-
400 PR00019A 11.19 
3.333e-09 389-403 
PR00019B 11.36 8.920e- 
09 363-377 


580


PR00253


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 2.125e- 
25 275-296 PR00253B 
13.47 7.923e-24 301- 
323 PR00253D 16.68 
5.846e-23 444-465
PR00253C 13.85 2.241e- 
20 335-357 


583 


PR00343 


SELECTIN SUPERFAMILY 
COMPLEMENT-BINDING
REPEAT SIGNATURE 


PR00343C 1^.85 2.2t1£e- 
11 1233-1252 PR00343C 
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e-
10 1491-1510 PR00343C
16.85 8.230e-10 1686-
1705 


584 


DM01537


kw SKI2W SKI2 NUCLEOLAR
HELICASE.


DM01537B 21.63 1.878e- 
37 79-126 DM01537B 
21.63 9.491e-30 916-
963 DM01537A 15.14
3.196e-11 784-804


586 


PF00013


KH domain proteins 
family of RNA binding 
proteins. 


PF00013 5.78 1.450e-09
124-136 


587 


DM00892 


3 RETROVIRAL PROTEINASE.


DM00892C 23.55 <J.409e- 
13 262-296 




BL00478 


LIM domain proteins.


BL00478B 14.79 1.643e- 
13 261-276 BL00478B 
14.79 7.709e-09 321- 
336 


590 


PF00855


PWWP domain proteins.


PF00855 13.75 5.000e-
15 931-948 


591 


PF00855


PWWP domain proteins. 


PF00855 13.75 8.000e- 
15 1062-1079 


593 


PF00628 


PHD-finger. 


PF00628 15.84 3.455e-
12 424-439 


594 


PR00205 


CADHERIN SIGNATURE


PR00205B 11.39 2.241e-
16 558-576 PR00205A
14.73 9.308e-13 542- 
558 PR00205C 13.65 
5.304e-12 594-609 
PR00205B 11.39 4.273e- 
10 336-354 


596 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.789e-
18 307-338 


598 


PD01675


GLYCOPROTEIN MAJOR 


PD01675C 19.89 2.330e- 
10 55-39 


600 


BL00242


Integrins alpha chain
proteins.


BL00242EJ 9.03 9.591e-
27 985-1014 BL00242C
16.86 4.115e-26 286-
316 BL00242D 13.57
4.150e-25 357-382
BL00242B 8.13 7.353e-
12 189-199 BL00242D
13.57 3.455e-11 421-
446 BL00242A 13.80
5.000e-11 6^1-73
BL00242D 13.57 4.986e-
10 291-316


601 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 5.610e-
09 198-213 


602 


PR00278


PANCREATIC HORMONE


PR00278A 12.43 4.5^9e- 
10 331-348 


603 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479C 12.01 3.250e- 
12 170-183 


604 


BL00315


Dehydrins proteins.


BL00315A 9.35 1.672e- 
09 424-452 


605 


BL00415


Synapsins proteins. 


BL00415N 4.29 9.794e- 
10 295-339 


606 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 1.000e-
13 335-358 


608


PF00855


PWWP domain proteins.


PF00855 13.75 5.167e-
15 265-282 


609 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.167e- 
15 211-228 


612 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 7.411e- 
10 877-897 DM01206B 
10.69 8.027e-10 861- 
881 DM01206B 10.69 
9.137e-10 873-893 
DM01206B 10.69 1.456e-
09 859-879 DM01206B 
10.69 1.797e-09 879- 
899 DM01206B 10.69 
4.076e-09 865-885 
DM01206B 10.69 7.038e- 
09 898-918 DM01206B 
10.69 7.949e-09 871- 
891 DM01206B 10.69
8.291e-09 767-787


615 


PD02699 


PROTEIN DNA-BINDING 
BINDING DNA. 


PD02699A 8.91 2.023e- 
28 129-158 PD02699C 
24.84 1.000e-27 317- 
364 PD02699B 18.28 
1.000e-17 158-182


616 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436- 
455


617 


PR00380


KINESIN HEAVY CHAIN
SIGNATURE


PR00380A 14.18 4.086e-
22 288-310 PR00380D
9.93 3.721e-17 486-508
PR00380B 12.64 2.241e-
16 410-428 PR00380C
13.18 2.976e-13 436- 
455 


618 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 5.143e-
12 531-551 DM01206B
10.69 2.603e-10 535-
555


621 


PR00700


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


Donninnn 1 c an 1 irn tt 
rXUU / ^UB Ib.oU ^ . JLove - 

21 561-582 


622 


BL00239


Receptor tyrosine kinase
class II proteins.


BL00239F 28.15 3.222e-
10 647-692 BL00239C 
18.75 8.304e-10 543- 
566 


623


PR00407


EUKARYOTIC MOLYBDOPTERIN
DOMAIN SIGNATURE


PR00407K 9.94 8.448e-
09 326-339 


624 


BL00641 


Respiratory-chain NADH
dehydrogenase 75 Kd
subunit proteins.


BL00641C 21.10 1.000e-
40 157-202 BL00641B
24.37 1.000e-40 255-
308 BL00641F 33.12
1.000e-40 571-623
BL00641A 17.15 1.818e-
37 48-80 BL00641B
12.62 5.846e-34 113-
139 BL00641D 13.23
9.308e-29 216-240


627


PR00103


CAMP-DEPENDENT PROTEIN
KINASE SIGNATURE 


PR00103B 17.80 2.500e- 
10 367-380 PR00103B 
13.39 2.080e-14 297- 
312 PR00103A 9.59 
2.957e-14 282-297 
PR00103D 10.83 3.077e- 
12 346-358 PR00103C 
15.68 1.000e-11 334-
344 PR00103B 13.39
1.450e-11 175-190
PR00103A 9.59 1.720e- 
10 160-175 


630 


PR00081


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10. S3 i. ills- 
16 4-22 


631 


PF00651


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.500e- 
14 37-50 


632 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 2.233e-
10 1324-1344 DM01206B 
10.69 4.822e-10 1276- 
1296 DM01206B 10.69 
7.658e-10 1328-1348 
DM01206B 10.69 8.274e-
10 1280-1300 DM01206B 
10.69 4.532e-09 1320- 
1340 DM01206B 10.69 
7.266e-09 1326-1346 


635 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.600e- 
23 145-176 BL00107B 
13.31 2.636e-13 211- 
227 


636 


BL00657 


Fork head domain 
proteins. 


BL00657A 19.39 1.545e- 
30 101-143 BL00657B 
22.27 7.750e-2C 149- 
192 


637 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.000e-
10 607-623


642


BL00018


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 4.913e-09 
199-212


647


PF00628


PHD-finger.


PF00628 15.84 2.350e- 
13 385-400 PF00628 
15.84 3.455e-12 464- 
479 


648


BL01129


Hypothetical
yabO/yceC/sfhB family
proteins. 


BL01129E 13.25 4.000e- 
25 332-357 BL01129C 
25.56 8.200e-23 236- 
279 BL01129B 12.51
6.118e-13 191-212 


649 


BL01228 


Hypothetical cof family 
proteins. 


BL01228D 17.44 3.908e-
10 455-480 


650


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 6.684e- 
13 771-814 


651 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 1.750e-
12 1026-1045 


653 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE


PR00253A 9.15 4.000e-
24 253-274 PR00253C
13.85 a.B00e-24 313-
335 PR00253B 13.47
3.143e-22 279-301
PR00253D 16.68 7.652e-
20 422-443


654


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 4.452e- 
11 969-997 PD01719A
12.89 3.961e-10 128- 
156 PD01719A 12.89 
7.395e-10 1276-1304 
PD01719A 12.89 1.222e- 
09 1220-1248 


657 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook).


BL00354C 6.61 8.397e- 
09 563-578 


658


BL00354


HMG-I and HMG-Y DNA-
binding domain proteins
(Ahook).


BL00354C 6.61 8.397e-
09 580-595 


659 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 2.174e- 

1j / i DMUUzXb 

19.43 4.7S0e-12 549- 
coo nMnnoi c io ai 

9.824e-ll 551-584 
10 548-581 DX00215 

17 i ^.UD^ie — Xl» 33U — 

583 DM00215 19.43 

DM00215 19.43 7.107e- 
10 544-577 


660 


PR00688


XYLOSE ISOMERASE
SIGNATURE


PR00688I 13.78 9.515e-
09 224-236 


661


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 5.950e-
23 249-292 


662 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e-
10 596-610 


663 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e-
10 596-610 


664 


PR00360


C2 DOMAIN SIGNATURE


PR00360B 13.61 7.158e-
10 596-610


666 


PR00819 


CBXX/CFQX SUPERFAMILY 


PR00819B 10.83 8.980e- 

Xv /U^l 1 CM 


667 


BL50040 


Elongation factor 1 
gamma chain profile. 


BL50040C 22.62 2.143e- 
16 135-178 


668 


PR00019 


LEUCINE-RICH REPEAT


PRdOOi9B 11.36 1.3£0e- 

11.19 1.667e-09 94-108 
PR00019B 11.36 4.600e- 
09 163-177 


670 


BL00018


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 3.25Ce-10 
681-694 BL00018 7.41 
6.400e-10 717-730


672 


PD00131


ATP-BINDING TRANSPORT
TRANSMEMBR.


PD00131B 34.97 1.000e-
34 356-410 PD00131C 
19.59 1.346e-26 504- 
542 


673 


PR00667


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR 
SIGNATURE 


PRCOc^G 15.33 7\£57e- 
10 106-123 


674 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 593-608 PR00320B 
12.19 4.115e-12 635- 
650 PR00320C 13.01
8.435e-11 717-732
PR00320C 13.01 2.800e- 
10 635-650 PR00320C 
13.01 6.400e-10 593- 
608 PR00320B 12.19 
3.250e-09 593-608 


675 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 4.857e-
13 572-587 PR00320B
12.19 4.115e-12 614-
629 PR00320C 13.01
8.435e-11 696-711
PR00320C 13.01 2.800e-
10 614-629 PR00320C
13.01 6.400e-10 572-
587 PR00320B 12.19
3.250e-09 572-587


676 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 9.667e- 
09 249-263 


D /3 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 3.700e-
16 225-236 PF00642
11.59 7.900e-12 187- 
198 


680 


PR00308


TYPE I ANTIFREEZE
PROTEIN SIGNATURE


PR00308C 3.83 8.754e-
10 286-296 


681


BL00019 


Actinin-type actin- 
binding domain proteins. 


BL00019D 15.33 4.200e- 
19 227-257 


682 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 4.000e- 
09 99-118 


687 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 8.500e-
10 538-553 


689 


BL01024 


Protein phosphatase 2A
regulatory subunit PR55 
proteins. 


BL01024A 10.26 1.000e-
40 22-69 BL01024B
8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-
40 146-185 BL01024D
13.22 1.000e-40 185-
222 BL01024E 11.96
1.000e-40 222-266
BL01024F 9.42 1.000e-
40 266-317 BL01024G
11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442


691 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 8.071e-
31 152-195 


692 


BL00211 


ABC transporters family 
proteins.


BL00211A 12.23 5.050e- 
09 45-57 


693 


BL00211 


ABC transporters family
proteins.


BL00211A 12.23 5.050e- 
09 45-57 


694 


BL00211 


ABC transporters family 
proteins.


BL00211A 12.23 5.050e-
09 58-70 


696


BL00680


Methionine
aminopeptidase subfamily
1 proteins.


BL00680 14.37 5.304e-
17 173-195 


697 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 3.418e- 
11 242-265 


698 


DM01930


2 kw FINGER SMCX SMCY
YDR096W.


DM01930E 15.41 1.367e- 
37 170-215 DM01930P 
14.16 8.232e-28 267-
303 DM01930B 19.86 
9.163e-10 37-71 


700 


PR00869 


DNA-POLYMERASE FAMILY X
SIGNATURE 


PR00869A 12.80 1.281e- 
16 245-263 


701 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.174e-
10 77-91 PR00048A 
10.52 6.870e-10 133- 
147 PR00048A 10.52 
8.826e-10 105-119 
PR00048A 10.52 5.320e-
09 161-175 


702 


BL00523


Sulfatases proteins.


BL00523E 19.27 2.565e-
25 326-356 BL00523A
13.36 5.050e-16 38-55
BL00523B 8.64 5.909e-
15 86-98 BL00523C
12.64 5.500e-13 137-
148 BL00523D 9.89
1.844e-11 290-302
BL00523G 9.46 5.500e-
10 513-523 BL00523F
10.85 6.351e-09 413-
424


703 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 fcl.412e- 
12 376-390 PR00048B 
6.02 1.000e-10 334-344 
PR00048B 6.02 1.474e- 
09 364-374 


707 


PD00787 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE. 


PD00787A 14.84 8.941e-
14 66-82 


708 


PR00761 


BINDIN PRECURSOR 
SIGNATURE 


PR00761E 11.32 8.500e-
10 822-841 


712 


DM01354 


kw TRANSCRIPTASE REVERSE
II ORF2.


DM01354Y 10.69 4.977e- 
38 425-465 DM01354X 
13.86 7.300e-34 376- 
415 DM01354V 12.97 
4.923e-17 311-358 
DM01354W 12.64 5.596e- 
10 356-376 


713 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.545e-
27 450-496 BL00039A
18.44 2.537e-18 147- 
186 BL00039C 15.63 
2.216e-14 280-304 
BL00039B 19.19 1.947e- 
13 194-220 


715 


BL00383


Tyrosine specific
protein phosphatases
proteins.


BL00383E 10.35 4.981e- 
10 150-161 


717


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 4.035e- 
21 106-161 


718 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 3.750e- 
39 20-68 DM00031B 
15.41 2.688e-28 84-118 
DM00031C 12.79 1.300e- 
12 131-142 


719 


BL00243 


Integrins beta chain
cysteine-rich domain
proteins.


BL00243B 17.54 1.000e-
40 131-172 BL00243C
16.42 1.000e-40 172-
208 BL00243D 24.07
1.000e-40 222-274
BL00243F 22.63 1.000e-
40 314-358 BL00243I
31.77 6.571e-39 607-
650 BL00243E 16.70
3.077e-35 274-304
BL00243G 21.38 3.625e-
34 358-400 BL00243H
17.53 5.235e-29 567-
593 BL00243A 17.61
3.250e-21 63-84
BL00243H 17.53 7.167e-
16 477-503 BL00243H
17.53 2.304e-11 524-
550 BL00243H 17.53
5.304e-11 606-632
BL00243I 31.77 1.380e-
09 610-653


720 


PR00217 


43 KD POSTSYNAPTIC 
PROTEIN SIGNATURE 


PR00217C 10.91 8.022e-
09 20-36 


722 


PR00704


CALPAIN CYSTEINE
PROTEASE (C2) FAMILY
SIGNATURE


PR00704D 11.05 5.909e-
34 135-161 PR00704F
13.61 7.000e-26 190-
218 PR00704E 12.55
8.071e-26 165-189
PR00704B 17.94 2.241e-
23 75-98 PR00704A
14.68 4.094e-19 30-54
PR00704C 11.88 1.871e-
18 99-116


725 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e-
09 169-187 


726 


PR00194


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


727 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 2.125e-
13 277-292 PR00320A
16.74 1.310e-11 277-
292 PR00320C 13.01
4.522e-11 323-338
PR00320A 16.74 6.586e- 
11 323-338 PR00320B 
12.19 4.343e-10 323- 
338 PR00320B 12.19 
6.914e-10 277-292 


731 


PR00195


DYNAMIN SIGNATURE


PR00195A 11.94 8.627e-
16 288-307 PR00195E
9.82 3.912e-11 457-474


733 


PF00642


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.082e- 
10 787-798 


738 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039A 18.44 2.555e-
28 25-65 BL00039D
21.67 2.105e-20 333-
384 BL00039C 15.63 
9.100e-13 160-184 
BL00039B 19.19 9.617e- 
11 73-99 


739 


BL01289 


TSC-22 / dip / bun 
family proteins. 


BL01289A 12.18 8.909e- 
31 326-353 BL01289B 
10.45 9.571e-17 353-
383 


742 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 7.078e- 
12 41-81 


743 


BL00965 


Phosphomannose isomerase 
type 1 proteins. 


BL00965C 23.78 1.000e-
40 256-305 BL00965B 
17.77 1.600e-25 126- 
153 BL00965A 10.57 
6.400e-19 94-113 


747 


BL00021 


Kringle domain proteins. 


BL00021D 24.56 4.563e-
25 231-273 BL00021B
13.33 5.345e-21 60-78


748 


BL00612


Osteonectin domain
proteins.


BL00612B 11.35 2.034e- 
11 93-126 


749


PR00450


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 6.880e- 
10 135-157 




BL00795


Involucrin proteins. 


BL00795C 17.06 6.000e-
11 384-429 BL00795C 
17.06 9.444e-lX 370- 
415


754


BL00051


Ribosomal protein L39e
proteins. 


BL00051 20.92 1.935e-
16 4-50


755


DM01970


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 7.723e-
09 171-184 


760 


BL01020


SAR1 family proteins.


BL01020C 15.35 9.020e- 


762 


BL00046


Histone H2A proteins.


BL00046 12.95 1.000e-
40 33-88 


763


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 9.137e- 
10 206-240 


764 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 8.800e- 
29 417-460 


767 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 6.063e-
10 309-324 BL01208B
15.83 8.031e-10 165-
180 BL01208B 15.83
4.162e-09 85-100


770 


BL00031


Nuclear hormones 
receptors DNA-binding 


BL00031A 19.55 9.571e- 
32 208-241 BL00031B
22.25 5.500e-27 242-
274 


772 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.450e-
18 4-26 PR00449E 
13.50 3.520e-14 142- 
165 PR00449C 17.27 
3.032e-13 44-67 
PR00449D 10.79 8.579e- 
13 107-121 PR00449B 
14.34 3.455e-11 27-44


773


BL00523


Sulfatases proteins.


BL00523E 19.27 9.333e- 
23 299-329 BL00523A
13.36 2.200e-13 47-64
BL00523B 8.64 2.607e-
13 91-103 BL00523D
9.89 7.923e-12 224-236
BL00523C 12.64 4.512e-
10 141-152 BL00523F 
10.85 5.821e-10 373- 
384 


775 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e-
09 568-585 


776


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e-
09 621-638 


777


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e- 
09 595-612 


778 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 8.412e- 
11 322-341 BL00030A 
14.39 7.000e-10 220- 
239 


779 


PR00079


GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE


PR00079B 12.98 2.929e-
26 193-222 PR00079E
16.65 4.150e-23 348-
375 PR00079C 8.68 
6.351e-16 246-264 
PR00079D 13.51 7.070e- 
16 264-281 PR00079A 
16.12 6.769e-13 169- 
183 


781 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 9.250e-
17 10-35 BL00215A 
15.82 6.000e-16 221- 
246 BL00215A 15.82 
7.857e-12 108-133 
BL00215B 10.44 9.526e- 
11 168-181 


783 


PD00289


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 6.276e-09 
159-173 


785


BL00690


DEAH-box subfamily ATP-
dependent helicases


BL00690B 13.38 1.000e-
12 147-165 BL00690A
6.87 5.320e-10 114-124
BL00690C 7.51 3.189e-
09 218-228


786


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 8.500e-
16 50-73 PR00449A
13.20 5.235e-14 8-30
PR00449E 13.50 2.853e-
11 150-173 PR00449D
10.79 1.545e-09 111-
125 


788


DM01206


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 8.767e-
10 1-21


790


BL00915


Phosphatidylinositol 3-
and 4-kinases proteins.


BL00915C 22.43 9.182e-
39 725-764 BL00915B
22.78 5.050e-33 633-
671 BL00915D 27.02
1.529e-21 795-831
BL00915A 10.09 1.000e-
13 395-407


791 


PR00208


GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE


PR00208A 12.59 5.294e-
10 120-138 PR00208A 
12.59 6.294e-10 121- 
139 PR00208A 12.59 
6.294e-10 122-140 
PR00208A 12.59 6.294e- 
10 123-141 PR00208A
12.59 6.294e-10 124- 
142 PR00208A 12.59 
6.294e-10 125-143 
PR00208A 12.59 6.294e- 
10 126-144 PR00208A 
12.59 6.294e-10 127- 
145 PR00208A 12.59 
6.294e-10 128-146 
PR00208A 12.59 6.294e- 
10 129-147 PR00208A 
12.59 7.411e-09 130- 
148 PR00208A 12.59 
7.658e-09 131-149 
PR00208A 12.59 7.904e-
09 132-150 PR00208A 
12.59 8.274e-09 118- 
136 PR00208A 12.59 
8.274e-09 119-137 


795 


PR00205


CADHERIN SIGNATURE


PR00205B 11.39 5.034e-
16 302-320 PR00205A
14.73 1.257e-11 284-
300 PR00205C 13.65
1.333e-11 337-352


796 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 4.000e-
12 196-247 BL00412D
16.54 5.705e-11 197-
248 BL00412D 16.54
7.848e-10 199-250
BL00412D 16.54 1.827e-
09 195-246 BL00412D 
16.54 1.918e-09 194- 
245 BL00412D 16.54 
2.102e-09 201-252 


797 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 6.339e-
13 40-58 




799 


BL01052 


Calponin family repeat
proteins.


BL01052C 18.51 1.000e-
40 87-127 BL01052A
16.12 1.529e-32 3-35
BL01052B 15.31 1.257e-
25 52-78 BL01052D
10.26 5.737e-25 174-
194 




800


BL00348


p53 tumor antigen 
proteins. 


BL00348F 23.19 3.714e- 
09 197-240 




801


BL00309


Vertebrate galactoside-
binding lectin proteins. 


BL00309C 18.65 1.621e- 
09 62-87 




802 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


r nuuzy 3u JL J . ** / a.Z^fte — 

09 187-199 




804 


PF00774 


Dihydropyridine 
sensitive L-type calcium 
channel (Beta subuni.


PF00774A 16.47 8.457e-
10 110-156 




808 


PR00667


RETINAL PIGMENT
EPITHELIUM-RETINAL GPCR
SIGNATURE


PR00667C 11.71 9.875e-
09 12-28 




810 


PD02346 


PHOTOSYSTEM H PROTEIN
PRECURSOR
PHOTOSYNTHESIS.


PD02346F 12.89 4.340e-
09 317-354




811 


BL00685 


CBF-A/NF-YB subunit
proteins.


BL00685B 14.41 6.779e-
14 54-95 BL00685A
11.22 4.798e-13 5-54


812 


PR00080 


ALCOHOL DEHYDROGENASE 
SUPERFAMILY SIGNATURE


PR00080A 9.32 9.419e- 
10 93-105 


813 


BL00357 


Histone H2B proteins.


BL00357 7.74 1.908e-17 
22-65 


815 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.923e-
15 158-171 PD00066
13.92 5.200e-14 46-59
PD00066 13.92 7.000e-
14 18-31 PD00066
13.92 7.000e-13 130-
143 PD00066 13.92
7.500e-13 214-227
PD00066 13.92 9.000e-
13 102-115 PD00066
13.92 4.429e-12 186-
199 PD00066 13.92
1.783e-11 74-87


816 


BL01195 


Peptidyl-tRNA hydrolase 
proteins. 


BL01195C 20.12 3.348e- 
20 100-139 


820 


BL00520


Interleukin-10 family 
proteins. 


BL00520A £.2l 6.4^1e- 
09 1-14 


822 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 8.113e- 
09 224-242 


825 


PR00876


NEMATODE METALLOTHIONEIN
SIGNATURE


PR00876B 7.66 2.268e-
10 101-115 


829 


PD02855 


FLAVOPROTEIN PROTEIN 
DNA/PANTOTHEN. 


PD02855A 18.37 4.732e-
28 88-124 PD02855B
8.36 6.478e-09 132-142


830 


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 7.000e- 
21 44-62 PR00405C 
19.41 1.000e-13 65-87 
PR00405A 17.71 7.283e- 
13 25-45 


831 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 1.000e-
09 47-61 PR00019B 
11.36 1.720e-09 136- 
150 PR00019B 11.36 
3.880e-09 44-58 


832 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.438e-
16 164-183 PR00011D
14.03 6.850e-16 164-
183 PR00011A 14.06
8.364e-14 164-183
PR00011C 24.25 5.415e-
12 231-260 PR00011D
14.03 9.852e-11 212-
231


834 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 232-246 


835 


PD00306


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 4.000e- 
10 290-304 


836 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 216-230 


837 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 3.898e-
09 78-111 


839 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B ^.4$ 8.3d2e- 
09 73-116 


840 


PR00700


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e- 
22 369-390 PR00700D 
12.47 5.765e-21 491- 
510 PR00700C 13.17 
4.750e-14 449-467 
PR00700F 11.18 8.500e-








11 538-549 PR00700E 
17.57 3.100e-10 522- 
538 


841 


PR00109


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 5.404e-
13 134-153 


844 


PD02785 


PROTEIN RIBOSOMAL 60S 
L22 RNA-BINDING HEP.


PD02785B 14.43 1.000e-
40 58-112 PD02785A 
15.23 1.915e-28 8-57 


845


BL00826


MARCKS family proteins.


BL00826C 7.63 6.738e-
09 203-230 


846 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.429e- 
10 15-24 


849 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.000e-
08 340-349 


850 


PR00308


TYPE I ANTIFREEZE
PROTEIN SIGNATURE


PR00308A 5.90 6.506e-
09 12-27 


851 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e-
16 246-280 


852 


BL00420 


Speract receptor repeat 
proteins domain 
proteins.


BL00420B 22.67 1.000e-
40 723-778 BL00420B
22.67 1.321e-38 933-
988 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 587-642 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 830-885
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 808-
819 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1018-
1029 BL00420C 11.90
7.955e-10 567-578


853 


BL00420


Speract receptor repeat 
proteins domain 
proteins. 


BL00420B 22.67 1.000e-
40 756-811 BL00420B
22.67 1.321e-38 966-
1021 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 620-675 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 863-918
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 841-
852 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1051-
1062 BL00420C 11.90








7.955e-10 567-578


857 


PR00388 


3',5'-CYCLIC NUCLEOTIDE
CLASS II 

PHOSPHODIESTERASE 
SIGNATURE 


PR00388A 10.45 2.778e- 
09 64-83 


859 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 1.^29e- 
13 37-56 BL00030B 
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e- 
10 128-147 


861 


PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.250e-
17 23-41 PR00988C
13.64 8.714e-16 107-
123 PR00988F 12.23
7.828e-15 198-212
PR00988E 8.27 9.769e-
12 176-188 PR00988D
5.95 8.250e-11 163-174
PR00988B 11.60 4.512e-
10 60-72


863


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215B 16.44 3.071e- 
12 41-54 


864 


PR00775 


90 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00775E 8.06 1.000e-
24 198-221 PR00775B 
3.52 1.837e-23 107-130 
PR00775D 8.91 4.484e- 
17 171-189 PR00775A 
9.90 8.342e-17 86-107 
PR00775C 10.68 9.379e- 
17 153-171 PR00775G 
10.64 6.850e-15 267- 
286 PR00775F 12.76 
6.769e-14 249-267 


866 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688D 16.45 9.460e-
09 89-121


867 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 5.596e- 
29 14-53 


868 


BL01287 


RNA 3'-terminal
phosphate cyclase 
proteins. 


BL01287A 17.95 2.688e-
26 16-48 


869 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.464e- 
10 304-337 


872 


BL00046


Histone H2A proteins.


BL00046 12.95 1.000e-
40 30-85 


874 


BL00188


Biotin-requiring enzymes
attachment site
proteins.


BL00188 30.29 9.036e- 
32 665-711 


876 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e- 
09 298-315 


877 


PD02102


SUBUNIT E V-ATPASE
VACUOLAR ATP SYNTHASE
HYDROL.


1 k b62102A 16". 74 4.17£e- 
10 97-141 


879 


BL01189 


Ribosomal protein S12e
proteins.


BL01189A 14.27 1.000e-
40 35-71 BL01189B
13.49 1.000e-40 71-125


882 


BL00284 


Serpins proteins. 


BL00284C 28.56 6.400e- 
25 62-104 BL00284B 
17.99 6.182e-12 35-56 


889 


BL00216


Sugar transport 
proteins. 


BL00216B 27.64 4.375e- 
21 35-85


896


PR00391 


PHOSPHATIDYLINOSITOL
TRANSFER PROTEIN 
SIGNATURE 


PR00391E 12.50 7.785e- 
15 211-231 PR00391B 
8.39 1.000e-13 83-104
PR00391D 12.21 5.328e-
13 191-207 PR00391A
7.83 5.390e-11 16-36


897 


PR00327 


ICE NUCLEATION PROTEIN


PR00327C 6.37 5.247e-






SIGNATURE 


09 313-328 


898 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.800e-
26 386-432 BL00039A
18.44 6.674e-16 113-
152 BL00039B 19.19
1.947e-13 153-179
BL00039C 15.63 9.460e-
11 236-260 


901 


PD00066 


PROTEIN ZINC-FINGER 
METAL-BINDI. 


PD00066 13.92 8.200e-
15 254-267 PD00066
13.92 8.200e-16 282-
295 PD00066 13.92
8.200e-16 310-323
PD00066 13.92 8.200e-
16 366-379 PD00066
13.92 8.200e-16 394-
407 PD00066 13.92
8.200e-14 338-351


902 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 9.321e-
11 6-50


903 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 9.160e- 
09 97-111 


904


PR00381


KINESIN LIGHT CHAIN
SIGNATURE


PR00381E 8.75 6.586e-
25 335-356 PR00381B
18.17 2.667e-24 204-
224 PR00381A 9.55
2.800e-24 107-125
PR00381C 12.48 4.522e-
24 226-245 PR00381D
13.94 1.084e-22 291-
309 PR00381F 9.13
3.288e-22 370-392
PR00381F 9.13 7.181e-
13 286-308 PR00381E
8.75 4.066e-11 251-272
PR00381E 8.75 7.035e-
11 293-314 PR00381E
8.75 8.364e-10 377-398
PR00381D 13.94 5.230e-
09 333-351 PR00381C
12.48 7.120e-09 310-
329


906 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e- 
09 525-549 


907 


PR00345


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e-
09 513-537


908


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
144-155 


910 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.800e- 
30 48-87 


917


BL01104


Ribosomal protein L13e
proteins. 


BL01104C 15.14 6.000e- 
09 364-392 


922 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 3.842e-09
500-511 


923 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 2.500e-
09 323-338 PR00320C 
13.01 5.500e-09 187- 
202 


924 


PD02181 


PROTOCHLOROPHYLLIDE
REDUCTASE PHOTOSYNT.


PD02181D 12.85 8.609e-
09 36-54 


924 


BL00019 


Actinin-type actin- 
binding domain proteins. 


BL00019C 14.6$ 7.4530- 
25 108-144 BL00019B 
13.34 6.510e-11 61-84
BL00019D 15.33 9.338e-
11 205-235 BL00019A
12.56 2.373e-10 34-45 


928 


BL00678


Trp-Asp (WD) repeat 


BL00678 9.67 9.308e-11






proteins proteins. 


273-284 BL00678 9.67 
1.600e-10 314-325 
BL00678 9.67 7.600e-10 
360-371 BL00678 9.67
8.579e-09 206-217 


929 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.857e- 
10 137-146 


930 


BL01085


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e- 
24 134-165 BL01085B 
10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 
20 172-202 BL01085C 
21.81 2.038e-14 66-97 


931 


BL01085


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e-
24 152-183 BL01085B
10.15 5.680e-22 30-52
BL01085E 18.87 8.676e-
20 190-220 BL01085C
21.81 2.038e-14 66-97


933 


PD00301


PROTEIN REPEAT MUSCLE
CALCIUM-BI. 


PD00301A 10.24 6.400e- 
09 160-171 


936 


PF00168 


C2 domain proteins. 


PF00168C 27.49 4.000e-
12 336-362 


937


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e-
10 5-49


940


PR00862


PROLYL OLIGOPEPTIDASE
SERINE PROTEASE (S9A) 
SIGNATURE 


PR00862D 16.17 4.086e- 
09 63-84 


945 


BL01230 


RNA methyltransferase
trmA family proteins. 


BL01230B 11.62 2.373e- 
09 407-420 


948 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 7.429e- 
18 52-68 BL00479A 
19.86 2.200e-13 25-49 


949 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 1.474e-09
100-111 


954 


PD01311 


PROTEIN OXIDOREDUCTASE 
NAD INTERGENIC RE. 


PD01311A 30.23 5.909e-
10 66-111 


955 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 3.250e-
12 47-60 


956 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 3.250e-
32 47-60 


957 


BL00379 


CDP-alcohol
phosphatidyltransferases
proteins.


BL00379 24.64 1.610e-
15 111-148 


959 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 1.884e- 
10 31-75 


960 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 3.438e- 
14 110-154 


962


BL00061


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.586e-
13 198-236 


963 


PR00502


MUTT DOMAIN SIGNATURE 


PR00502A 15.06 8.200e- 
11 210-225 


966 


PR00308 


TYPE I ANTIFREEZE
PROTEIN SIGNATURE 


PR00308A 5.90 7.035e- 
09 55-70 


9(11 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 1.266e- 
12 104-124 DM01206B 
10.69 5.299e-11 23-43
DM01206B 10.69 8.274e-
10 73-93 DM01206B
10.69 3.962e-09 108-
128 DM01206B 10.69
5.671e-09 38-58


969 


PF01008


initiation factor 2
subunit.


PF01008B 25.59 4.724e-
31 417-460 PF01008C
12.25 5.333e-18 506- 
526 PF01008A 20.14 
5.875e-15 369-390


970 


BL01277


Ribonuclease PH
proteins.


BL01277C 10.18 7.648e- 
10 112-143 BL01277A 
17.39 9.805e-10 40-78 


975 


BL01159 


WW/rsp5/WWP domain 
proteins.


BL01159 13.85 3.605e- 
12 130-145 BL01159 
13.85 4.122e-10 171- 
186 


9li 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors.


PF00791C 20.98 2.235e- 
09 55-94 


31& 


BL01167 


Ribosomal protein L17 
proteins.


BL01167B 20.66 8.258e-
19 88-127 


979 


BL00478 


LIM domain proteins. 


BL00478B 14.79 9.357e- 
13 33-48 BL00478B
14.79 7.250e-12 98-113 


980 


PR00312 


CALSEQUESTRIN SIGNATURE


PR00312E 8.32 3.423e- 
36 169-199 PR00312I 
15.78 5.286e-35 332- 
361 PR00312F 15.06 
5.865e-35 199-229 
PR00312H 13.31 8.313e- 
35 263-291 PR00312J 
13.73 5.688e-34 363- 
392 PR00312D 9.43 
2.636e-33 128-158 
PR00312C 15.14 8.839e- 
33 92-122 PR00312B 
15.08 8.941e-33 62-92 
PR00312G 11.11 6.657e- 
32 230-258 PR00312A 
11.70 6.914e-27 35-59 


981 


PF00992


Troponin.


PF00992A 1^.^*7 8.816e- 
09 414-449 


982 


PR00299 


ALPHA CRYSTALLIN 
SIGNATURE 


PR00299F 13.20 2.367e-
09 127-149 


983 


BL01150


Respiratory-chain NADH
dehydrogenase 20 Kd
subunit proteins.


BL01150B 17.16 1.000e-
40 156-202 BL01150A
14.10 8.200e-39 100-
138


986 


BL00795 


Involucrin proteins. 


BL00795C 17.06 7.211e- 
14 4-49 BL00795C 
17.06 1.778e-11 1-46
BL00795C 17.06 3.407e- 
10 14-59 BL00795C 
17.06 7.802e-10 2-47 
BL00795C 17.06 8.640e- 
10 19-64 BL00795C 
17.06 7.400e-09 11-56
BL00795C 17.06 7.800e- 
09 3-48 


987 


BL00939 


Ribosomal protein L1e
proteins. 


BL00939F 17.27 5.393e- 
09 810-840 


988 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.55 6.538e-
11 525-541 


989 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.55 4.538e- 
11 497-513 


994 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 2.506e-
25 146-189 


997 


BL01304


ubiH/COQ6 monooxygenase
family proteins. 


BL01304A 8.05 3.893e- 
11 65-79 


998 


DM01767 


5 TRANSMITTER DOMAIN. 


DM01767B 10.07 7.868e-
09 22-39 


1000 


PR00926


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926C 16.07 1.750e- 
24 73-94 PR00926D 
10.53 3.250e-23 126- 
145 PR00926F 17.75 
6.211e-23 217-240 
PR00926E 11.70 6.625e-








20 174-193 PR00926B 
16.07 2.125e-18 24-39
PR00926A 10.41 1.000e-
15 11-25 PR00926F
17.75 5.505e-09 120-
143 


1005 


BL00406 


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406D 12.58 3.700e-
40 270-325 BL00406E 
8.44 7.375e-38 327-377 
BL00406A 9.95 3.348e- 
29 11-46 


1006 


BL00406 


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406E 8.44 1.000e-
35 248-298 BL00406A 
9.95 3.348e-29 11-46 


1007


PR00304


TAILLESS COMPLEX
POLYPEPTIDE 1
(CHAPERONE) SIGNATURE


PR00304D 11.04 8.714e-
22 384-407 PR00304C
8.69 4.667e-20 98-118
PR00304B 11.60 7.577e-
19 68-87 PR00304A 
9.20 3.382e-16 46-63 
PR00304E 7.79 6.870e- 
13 418-431 


1009 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
32 9-48 


1011


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.929e- 
32 68-107 


1012 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 6.143e-
10 64-73 


1016 


PD01168 


SYNTHETASE LIGASE
PROTEIN ALANYL.


PD01168H 12.08 1.000e-
11 174-194 


1018 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 1.391e- 
32 261-302 PD00930A 
25.62 9.550e-22 157- 
183 


1022 


BL00175 


Phosphoglycerate mutase
family phosphohistidine
proteins.


BL00175A 15.42 5.179e-
12 6-26 BL00175C
23.75 8.062e-10 79-111


1025


PR00305


14-3-3 PROTEIN ZETA
SIGNATURE


PR00305D 16.34 1.439e- 
10 158-185 


1026 


BL00353 


HMG1/2 proteins. 


BL00353B 11.47 2.436e- 
18 238-288 BL00353C

14,03 o.B44e-ll 288- 

335 


1028 


BL00183 


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.310e- 

JJ 4Jk— J7l 


1033 


PF00580


UvrD/REP helicase. 


PF00580A 13.37 4.720e- 
us m-133 


1034 


PR00413


haloacid
dehalogenase/epoxide
hydrolase family
signature


PR00413E 15.78 3.429e- 

134-1/1 


1037 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.657e-
09 5-44 


1038 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU.


PD01796 15.01 4.259e- 
11 55-82 


1039 


BL00299


Ubiquitin domain
proteins.


BL00299 28.84 9.036e- 
09 17-69 


1040 


PR00970 


ARGININE ADP- 
RIBOSYLTRANSFERASE 


PR00970A 17.73 6.143e-
20 56-78 PR00970D






SIGNATURE 


9.96 2.154e-18 154-171
PR00970F 12.30 1.000e-
16 224-241 PR00970G
9.97 9.229e-15 242-258
PR00970B 16.37 1.290e-
13 85-105 PR00970C
11.05 1.643e-11 115-
130 PR00970E 11.23
9.820e-11 202-218


1042 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10 
243-254 


1043 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 6.786e-
13 114-128 PR00048A 
10.52 1.000e-09 172- 
186 


1045 


BL00615 


C-type lectin domain 
proteins.


BL00615A 16.68 1.720e-
11 218-236 BL00615B
12.25 1.857e-10 317- 
331 


1046 


BL01092 


Adenylate cyclases 
class-I proteins.


BL01092N 13.54 8.924e- 
10 3-40 


1047 


BL01216 


ATP-citrate lyase / 
succinyl-CoA ligases 
family proteins. 


BL01216D 21.75 4.316e-
28 314-344 BL01216A 
13.91 1.000e-10 97-112 


1049 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 7.610e-
12 102-136 


1050 


BL01073 


Ribosomal protein L24e 
proteins.


BL01073 24.30 1.000e-
40 12-62 


1054 


BL00571 


Amidases proteins.


BL00571 25.69 5.875e-
31 160-212 


1055 


BL00030


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 5.235e-
11 98-117 BL00030B 
7.03 4.316e-09 137-147 


1058 


BL00223 


Annexins repeat proteins 
domain proteins. 


BL00223C 24.79 8.754e-
23 262-317 BL00223A
15.59 9.478e-14 46-80
BL00223A 15.59 5.557e-
11 118-152 


1060 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 3.455e- 
35 159-201 


1064 


BL00455


Putative AMP-binding
domain proteins.


BL00455 13.31 6.211e- 
13 280-296 


1065 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 2.000e- 
09 115-129 PR00019B 
11.36 3.880e-09 87-101 


1066 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.75 4.600e-
16 151-172 PR00326C
9.79 1.290e-14 200-216
PR00326B 16.74 8.548e-
14 172-191 PR00326D
19.09 1.257e-13 217-
236


1071 


PD02870


RECEPTOR INTERLEUKIN-1
PRECURSOR. 


PD02870B 18.83 8.515e-
11 164-197 


1072 




SET domain proteins. 


PF00056A 26.14 5.976e- 
09 350-387 


1075 


BL01009 


Extracellular proteins 
SCP/Tpx-1/Ag5/PR-1/Sc7


BL01009D 14.19 4.300e- 
20 127-148 BL01009A 
XJ . /b b.3obe-13 b /- /b 
BL01009E 13.50 1.439e- 
11 159-175 


1077 


PR00724 


CARBOXYPEPTIDASE C 
SERINE PROTEASE (S10) 
FAMILY SIGNATURE 


PR00724A 10.91 1.000e-
08 366-379 


1078 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 1.000e-
12 170-195 BL00215A 
15.82 7.529e-10 79-104 


1079 


BL00678 


Trp-Asp (WD) repeat


BL00678 9.67 4.316e-09






proteins proteins.


298-309 


1081 


BL00326


Tropomyosins proteins.


BL00326A 14.01 7.398e-
10 23-57 


1094 


BL00460


Glutathione peroxidases 
selenocysteine proteins. 


BL00460A 28.67 3.204e-
18 57-92 BL00460B
9.73 6.400e-13 100-118
BL00460D 16.09 9.143e-
12 162-182 BL00460C
14.35 5.500e-09 133-
156


1095




PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN.


PD02811A 20.67 3.017e-
22 67-105 PD02811B
17.07 2.263e-21 118-
151 PD02811C 13.25 
5.696e-13 154-167 


1096 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 60-98 PD02811B 
17.07 2.263e-21 111- 
144 PD02811C 13.25
5.696e-13 147-160 


1097 


BL00479 


Phorbol esters / 
diacylglycerol binding
domain proteins.


BL00479B 12.57 6.143e- 
09 200-216 


1105 


PF00881 


Nitroreductase family. 


PF00881A 27.1s 1 9.229e- 
13 111-147 


1109 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.077e- 
10 15-37 PR00449E 
13.50 1.857e-09 185- 
208 PR00449D 10.79 
8.364e-09 131-145 


1115 


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.737e- 
20 42-60 PR00405A 
17.71 2.703e-17 23-43 
PR00405C 19.41 6.902e- 
10 63-85 


1116




HMG14 and HMG17 
proteins.


BL00355 5.97 2.528e-25
20-51 


1117 


BL00355 


HMG14 and HMG17 
proteins.


BL00355 5.97 2.528e-25 
20-51 


1120 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 4.857e-
10 290-306 


1123 


PR00412


EPOXIDE HYDROLASE 
SIGNATURE 


PR00412F 18. 7£ 9.526e- 
12 301-324 




PR00186


HEMERYTHRIN SIGNATURE


PR00186A 13.62 2.800e-
09 87-101




BL00170


Cyclophilin-type 
peptidyl-prolyl cis- 
trans isomerase 
signatur. 


BL00170C 18.49 3.077e- 
33 84-129 BL00170B 
20.97 6.838e-25 37-77 
BL00170A 17.08 3.455e-
15 10-37 


1131 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 5.304e- 
15 29-46 BL00636B 
15.11 1.360e-14 59-80 


1132 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09
29-40 


1133 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1136 


BL00990 


Clathrin adaptor
complexes medium chain
proteins.


BL00990C 18.78 4.176e- 

21.44 4.316e-36 94-132
BL00990B 20.15 2.125e- 
27 157-187 BL00990D 
16.13 5.320e-18 403- 
422 


1137 


PR00314 


CLATHRIN COAT ASSEMBLY 
PROTEIN SIGNATURE 


PR00314B 15.68 8.000e- 
34 100-128 PR00314D 
9.66 3.531e-33 233-261 
PR00314C 16.05 8.909e-








32 159-188 PR00314A 
14.53 1.281e-22 13-34


1139 


BL01115 


GTP-binding nuclear 


BL01115A 10.22 6.364e- 
13 13-57 


1141 


BL00107


Protein kinases ATP-
binding region proteins. 


BL00107A 18.39 4.000e-
19 451-482 BL00107B
13.31 3.077e-12 519-
535 


1148


PR00685


TRANSCRIPTION INITIATION
FACTOR IIB SIGNATURE


PR00685A 13.62 4.676e- 
09 21-42 


1155 


PD01652 


RECEPTOR CELL NK 
laLiiCU± , KOTBlN IMMUNQGLOB . 


PD01652B 8.50 9.396e- 
10 522-574 PD01652B 
8.50 9.463e-10 740-792 


1157 


PD02894


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894A 21.96 7.873e- 
28 81-127 PD02894B 
13.93 1.188e-27 178- 
211 


1159 


BL00623 


GMC oxidoreductases 
proteins.


BL00623E 15.00 3.531e-
20 391-414 BL00623C 
10.86 4.240e-20 155- 
176 


1161 


PD01937


DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e- 

09 330-341 


1162 




DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e-
09 221-232 


1163 


PR00624


HISTONE H5 SIGNATURE


PR00624D 11.94 7.455e-
10 214-239 PR00624D 
11.94 1.961e-09 312- 
337 


1167 


BL00226


Intermediate filaments
proteins.


BL00226B 23.86 7.384e- 
09 302-350 


1177


BL01032


Protein phosphatase 2C
proteins.


BL01032G 8.33 1.422e-
10 34-48


1178 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 1.794e-
10 205-220 PR00320C
13.01 7.840e-10 205-
220 PR00320B 12.19
8.457e-10 35-50
PR00320A 16.74 7.146e-
09 35-50 PR00320B
12.19 9.100e-09 79-94


1180 


PR00454


ETS DOMAIN SIGNATURE 


PR00454D 10.89 4.150e- 
19 765-784 


1181 


BL00291


Prion protein. 


BL00291A 4.49 8.962e- 
11 152-187 


1184 




Guanine-nucleotide
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 4.103e- 
18 1089-1113 


1185 


BL00215


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.553e- 
13 204-229 BL00215A 
15.82 1.429e-12 11-36 
BL00215A 15.82 9.809e- 
11 104-129 


1187 


BL00983 


Ly-6 / u-PAR domain 
proteins. 


BL00983C 12.69 2.761e- 
10 77-93 


lie's " 


BL00878 


Orn/DAP/Arg 

decarboxylases family 2 
pyridoxal-P attachment 
si. 


BL00878B 10.95 6.000e-
16 189-204 BL00878C
17.74 8.435e-15 225-
245 BL00878F 19.67
3.625e-13 379-402 
BL00878D 16.56 1.621e- 
09 270-289 


1191 


PD02939 


PROTEIN GLUTATHIONE 
SYNTHETASE SY. 


PD02939B 10.10 2.723e- 
12 203-220 PD02939C 
20.01 1.000e-11 224-
252 


1193 


PR00345


STATHMIN FAMILY
SIGNATURE 


PR00345B 7.12 2.800e- 
28 72-101 PR00345E








8.54 7.652e-28 149-174 
PR00345C 4.54 9.100e-
28 101-125 PR00345D
10.97 1.964e-24 125- 
149 PR00345A 13.46 
5.645e-16 43-62 




PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345B 7.12 2.800e-
28 108-137 PR00345E 
8.54 7.652e-28 185-210 
PR00345C 4.54 9.100e- 
28 137-161 PR00345D 
10.97 1.964e-24 161- 
185 PR00345A 13.46 
5.645e-16 79-98 


1195 


PF00995


Sec1 family.


PF00995B 17.37 1.120e- 
13 224-264 


1196 


BL00982


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 6. ^Be- 
11 15-47


1197 


BL01298


Dihydrodipicolinate
reductase proteins.


BL01298A 13.90 5.959e-
09 51-73 


1203 


BL00061


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 1.000e-
14 152-190 


1204 


PR00118


BETA-LACTAMASE CLASS A
SIGNATURE


PR00118F 16.42 9.386e-
09 213-229 


1206 


BL01183 


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 1.429e- 
37 184-229 BL01183D 
27.71 8.535e-27 264- 
307 BL01183A 13.25 
3.250e-23 51-73 
BL01183C 10.77 5.295e-
09 246-259 


1208 


BL00979 


G-protein coupled 
receptors family 3 
proteins.


BL00979L 20.63 2.485e- 
09 105-146 


1209 


PF00023


Ank repeat proteins.


PF00023A 16.03 4.857e-
11 49-65 PF00023B
14.20 1.818e-09 45-55


1212 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e- 
14 227-241 PR00048A 
10.52 4.316e-11 199-
213 


1213 


PR00450 


RECOVERIN FAMILY 
SIGNATURE 


PR00450C 12.22 1.720e- 
10 20-42 PR00450C
12.22 3.506e-09 56-78 
PR00450D 16.58 6.769e- 
09 44-64 


1216 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412D 16.54 5.598e- 
10 179-230 


1219


PR00456


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E ^.06 S.348e- 
11 249-264 


1222 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.231e- 
15 295-308 PD00066 
13.92 7.231e-15 406- 
419 PD00066 13.92 
2.286e-12 378-391 
PD00066 13.92 7.857e- 
12 434-447 PD00066 

363 


1223 


BL50058 


G-protein gamma subunit 
profile. 


BL50058 27.23 1.000e-
40 13-61 


1226


BL00412 


Neuromodulin (GAP-43) 
proteins.


BL00412D 16.54 8.439e- 
09 279-330 


1227 


BL00437 


Catalase proximal heme- 
ligand proteins. 


BL00437A 18.82 1.000e-
40 49-101 BL00437B
16.28 1.000e-40 114-
168 BL00437C 21.86








1.000e-40 190-239 
BL00437D 25.72 1.000e-
40 248-301 BL00437E
1.000e-40 327-
379


1230 


BL01160 


Kinesin light chain 


BL01160B 19.54 8.297e-
10 5-60 


1231 


PR00735


GLYCOSYL HYDROLASE
FAMILY 8 SIGNATURE


PR00735A 11.19 6.857e-

AQ ^ Ol » A C 

OS 391-405 


1232 


PR00497


NEUTROPHIL CYTOSOL 


PR00497A 6.92 5.553e-
10 158-176


1233 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 5.553e- 
10 158-176 


1235 


BL00866 


Carbamoyl-phosphate
synthase subdomain
proteins.


BL00866B 36.29 2.776e- 
09 75-121 


1237 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 1.818e-
21 36-79 


1243 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 1.184e- 
11 10-25 


1246 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL.


PD01168L 9.47 2.837e-
10 31-46 PD01168L 
9.47 4.490e-10 174-189 
PD01168L 9.47 7.612e- 
10 183-198 


1249 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 2.860e-10
183-196 


1254 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 2.440e- 
36 96-144 


1255 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 5.670e-
11 8-52 


1256


BL00373


Phosphoribosylglycinamide
formyltransferase
proteins.


BL00373C 10.35 3.348e-
12 143-156


1258 


PR00011


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.217e- 
10 174-193 


1259 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins. 


BL00518 12.23 8.286e- 
10 31-40 


1261 


PR00070


DIHYDROFOLATE REDUCTASE
SIGNATURE


PR00070D 11.63 1.000e-
15 112-127 PR00070C 
13.09 9.500e-15 51-63 
PR00070A 12.92 5.500e-
12 16-27 




BL00462


Gamma-
glutamyltranspeptidase
proteins.


BL00462A 20.89 6.438e-
24 140-183 BL00462B
17.88 5.500e-20 230-
267 BL00462C 27.41
2.023e-11 292-347


1263


BL00038


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 9.455e-
11 62-83 


1264 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e-
11 17-61 


1266 


PR00837


ALLERGEN V5/TPX-1 FAMILY
SIGNATURE


PR00837C 17.21 2.714e-
10 165-182 PR00837A
14.77 4.512e-12 86-105
PR00837D 11.12 7.577e-
12 201-215 


1269 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 9.308e- 
22 40-63 PR00449E 
13.50 1.000e-16 137- 
160 PR00449D 10.79 
3.520e-ll 102-116 


1270 


BL00276 


Channel forming colicins 
proteins. 


BL00276A 8.87 1.500e-
09 17-29 


1275 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO.


PD02327C 15.47 9.769e- 
09 228-243 


1276 


PR00412 


EPOXIDE HYDROLASE 


PR00412B 12.59 7.894e-


SIGNATURE


12 119-135 PR00412C 
11.30 1.857e-11 165-

1 "7 0 DBftrtil 11 *%1 

XiJ fKUUSl^A 1J.23 
3.400e-11 100-119


1277 


" PF0075"6 




ft UU/doU 14.12 9.538e- 
10 127-157 


1279 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins.


BL00134A 11.96 9.325e-


1280


BL01220


Phosphatidylethanolamine
-binding protein family
proteins.


oxiUl^^Ov. 14.75 9.348e- 
15 248-276


1285


BL00518


Zinc finger, C3HC4 type


BL00518 12.23 2.286e-
10 33-42


1287 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 7.182e- 
11 288-343


1292


PR00802


SERUM ALBUMIN FAMILY 


PR00802B 16.51 1.610e- 
10 81-105 


1297 


PR00716


M-PHASE INDUCER
PHOSPHATASE SIGNATURE 


PR00716C 17.65 5.696e- 
09 23-44 


1298 


BL00478 


LIM domain proteins.


BL00478B 14.79 6.478e- 
14 268-283 






Pancreatic ribonuclease 
family proteins. 


BL00127C 31.49 3.571e-
28 82-126 BL00127B 
26.57 8.800e-28 23-68 


1302 


PR00637 


TYPE 3 BOMBESIN RECEPTOR 
SIGNATURE 


PR00637E 11.27 4.250e- 
09 290-306 


1307 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 5.500e-
17 13-38 BL00215A 
15.82 1.000e-16 226- 
251 BL00215A 15.82 
2.658e-13 107-132 


1308 


PR00898


VASOPRESSIN V2 RECEPTOR
SIGNATURE


PR00898H 11.34 4.682e-
09 552-572 


1309 


PD00301


PROTEIN REPEAT MUSCLE
CALCIUM-BI.


PD00301B 5.49 2.731e- 
09 390-401 


1310 


BL00983


Ly-6 / u-PAR domain 
proteins. 


BL00983C 12.69 9.654e- 
13 73-89 BL00983B 
8.19 3.132e-09 12-22


1313 


BL00194


Thioredoxin family 
proteins. 


BL00194 12.16 1.900e- 
11 15-28 


1314 




Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 8.969e- 
10 53-97 


1316 




Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e-
13 128-145 


1320


BL00783


Ribosomal protein L13
proteins.


BL00783C 22.43 6.559e-
24 87-117 BL00783A
14.33 1.000e-19 8-33
BL00783B 12.76 3.500e-
12 74-86 


1327 


PF00514 


like repeat proteins. 


PF00514A 31.30 7.268e-
11 82-120 


1329 


BL00030 


Eukaryotic RNA-binding 

region RNP-1 proteins.


BL00030A 14.39 6.294e- 
-LX I^?-14o BuOU030B 

7.03 4.789e-09 168-178 


1331 


PR00497 


NEUTROPHIL CYTOSOL
FACTOR P40 SIGNATURE


PR00497A 6.92 7.239e-
09 25-43


1332 


PR00161 


NICKEL-DEPENDENT
HYDROGENASE/B-TYPE
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e- 
09 317-337 


1333 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.769e- 
33 10-49


1336


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 2.200e- 
09 262-281 


1337 


PR00700 


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700D 12.47 2.200e-
09 211-230


1340 


PR00860 


VERTEBRATE 
METALLOTHIONEIN
SIGNATURE 


PR00860A 5.46 5.034e- 
13 5-18 


1341 


BL00893 


mutT domain proteins. 


BL00893 18.99 6.750e- 
16 46-71 


1343 


BL01282 


bir repeat proteins. 


BL01282B 30.49 5.974e-
21 383-422 


1344


DM00099


4 kw A55R REDUCTASE
TERMINAL
DIHYDROPTERIDINE.


DM00099B 14.73 8.313e-
09 417-427 


1345 


BL00923 


Aspartate and glutamate 
racemases proteins.


BL00923B 11.41 5.935e- 
10 135-146 


1348 


PF00651


BTB (also known as BR- 
C/Ttk) domain proteins. 


rruuoai 1j < UU 7.2jle- 

13 44-57 


1350 


PR00193


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 3.571e- 
32 416-445 PR00193C 
12.60 6.318e-31 179- 
207 PR00193B 11.69 
3.571e-24 133-159
PR00193E 19.47 9.069e- 
22 470-499 PR00193A 
15.41 1.783e-20 77-97 


1352 


PR00447 


ASSOCIATED MACROPHAGE 
PROTEIN SIGNATURE 


fKuuq«/ii y.73 1.554e- 
15 299-319 PR00447D 
13.54 3.408e-15 200- 
224 PR00447A 12.73 
5.357e-11 97-124
PR00447G 6.69 9.877e- 
10 353-373 


1353 


BL00303 


S-100/ICaBP type calcium
binding protein.


BL00303A 21.77 6.667e- 
26 45-82 BL00303B
26.15 1.000e-24 93-130


1355 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.950e- 
29 375-421 BL00039A 
18.44 7.136e-29 99-138 
BL00039C 15.63 4.000e- 
18 226-249 BL00039B
19.19 3.182e-14 141- 
167 


1357 


PF00615


Regulator of G protein 
signalling domain 
proteins. 


PF00615B Id. 25 2.216e- 
12 B4-101 PF00615C 
10.06 8.412e-12 162- 

■1 f& 


1360 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.234e-
29 10-49 


1361 


PR00925


NONHISTONE CHROMOSOMAL
PROTEIN HMG17 FAMILY
SIGNATURE 


PR00925A 5.47 5.091e- 

3.73 6.143e-14 29-42 
PR00925C 5.57 4.789e- 
12 53-64 PR00925D 
6.56 1.857e-10 76-87 


1362 


BL01272


Glucokinase regulatory
protein family proteins. 


BL01272B 19 fil £ HTrto — 

30 136-171 BL01272C 

274 BL01272A 6.49 
1.231e-18 99-117 


1363


BL01272 


Glucokinase regulatory 
protein family proteins. 


BL01272B 19.61 6.870e- 
30 113-148 BL01272C 
11.68 3.314e-25 226- 
251 BL01272A 6.49 
1.231e-18 76-94


1364 


DM00179 


w KINASE ALPHA ADHESION
T-CELL.


DM00179 13.97 5.304e- 
09 167-177 


1368 


PR00169


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.592e- 
09 76-96 


1370 


PR00988


URIDINE KINASE SIGNATURE


PR00988A £.34 1.7S-4e-
10 1-19 


1371 


BL00242 


Integrins alpha chain 


BL00242B 8.13 8.615e-


1372 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625B 13.48 7.353e-
19 46-67 PR00625A
12.84 1.391e-16 14-34


1373 


BL00434 


HSF-type DNA-binding
domain proteins.


BL06434C 23.8* 3.778e- 

no on ha 
US 30-130 


1374 


PR00962 


LETHAL (2) GIANT LARVAE 


PR00952C 8.00 6.337e- 

AO CAP t 1 ^~ 

09 505-526 


1375


PD02475 


MUCIN EPITHELIAL TUMOR- 
& c cnr t w pi? 


PD02475A 23.18 8.552e-
10 1111-1150 


1376 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.571e- 
32 24-63 


1380 


BL00194


Thioredoxin family 
proteins. 


BL00194 12.16 8.333e- 
12 48-61 


1381 


DM01970


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 1.458e- 
15 1123-1136 


1383


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
243-254 


1384 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 7.600e-10
271-282 


1385 


BL00303


S-100/ICaBP type calcium 
binding protein. 


BL00303B 26.15 6.203e- 
10 95-132 


170c 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.042e- 
09 1574-1628 


1JQ / 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 1.000e-
11 52-61 


1389 


PD01066


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 3.600e- 
30 10-49 


1390 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 3.512e-
31 32-71 


1392 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 9.723e- 
10 127-137 


1393


PR00380


KINESIN HEAVY CHAIN
SIGNATURE 


PR00380A 14.18 9.625e- 
25 88-110 PR00380D 
9.93 2.406e-20 304-326 
PR00380B 12.64 4.414e- 
16 208-226 PR00380C 
13.18 6.538e-16 243-
262 


1394 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-
14 462-475 PD00066
13.92 8.800e-14 348-
361 PD00066 13.92
9.571e-12 405-418
PD00066 13.92 6.087e-
11 490-503 PD00066
13.92 8.043e-11 320-


1398 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 6.786e-
32 10-49 


1400 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 7.038e- 
09 270-290 


1406 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930A 25.62 7.324e-
15 363-389 


1407 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.500e- 
10 457-476 


1408 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 9.550e-
11 179-193 PR00019A
11.19 8.826e-10 229-
242 PR00019B 11.36
1.360e-09 199-213
PR00019B 11.36 4.960e-


1409 


PR00510


NEBULIN SIGNATURE 


09 176-196

PR00510A 9.09 4.150e-
12 182-202 PR00510B 
12.96 8.767e-12 210- 
230 PR00510F 9.88 
8.172e-10 58-75 
PR00510D 9.21 2.367e- 
09 251-257 


1410 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.696e- 
09 31-44 


1412


BL00358


Ribosomal protein L5
proteins. 


BL00358B 22.76 1.000e-
40 57-103 BL00358C 
13.75 6.087e-14 122- 
136 BL00358D 14.26 
5.500e-13 143-158 
BL00358A 13.06 1.931e-
11 33-44 


1414 


BL00282 


Kazal serine protease 
inhibitors family 

proteins.


BL00282 16.88 7.338e- 
10 511-534 


1415 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins . 


BL00023 24.31 4.300e- 
29 40-77 


1417 


PR00681


RIBOSOMAL PROTEIN S1
SIGNATURE 


PR00681G 12.54 2.149e-
09 38-60


1418


DM00973


3 kw RESISTANCE BENOMYL,
YLL028W CYCLOHEXIMIDE.


DM00973A 21.17 1.462e- 
09 171-208 


1419 


PR00319 


BETA G- PROTEIN 
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 1.571e- 
09 428-443 


i Aon 


PD01941 


TRANSMEMBRANE 
COTRANSPORTER SYMP.


PD01941A 14.81 1.000e-
40 142-196 PD01941B
15.02 7.049e-30 400-
447 PD01941B 15.92
2.475e-20 817-864
PD01941C 19.96 3.118e-
19 488-543 PD01941D
27.18 9.614e-18 641-
690 PD01941F 28.52
5.382e-15 1038-1093


1422


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 8.043e-
12 199-217 


1423 


PR00209 


ALPHA/BETA GLIADIN
FAMILY SIGNATURE


PR00209B 4.88 6.318e- 
11 1009-1028 


1424 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 8.200e-
14 367-306 BL50002A
14.19 9.250e-12 298-
317 BL50002A 14.19
4.462e-11 208-227
BL50002B 15.18 1.000e-
09 244-258 


1425 


PF00628 


PHD-finger.


PF00628 15.84 3.045e-
12 330-345


1426


PF00628 


PHD-finger.


PF00628 15.84 3.045e-
12 377-392 


1427 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.114e-
16 281-299 PR00405A 
17.71 4.306e-14 262- 
282 


1428 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.219e-
34 147-193 


1429 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.920e- 
10 577-592 


1430


PR00378


INOSITOL PHOSPHATASE
SIGNATURE


PR00378D 16.86 7.563e-
12 295-314 PR00378B
13.80 8.650e-10 166-
186


1431


PR00928


GRAVES' DISEASE CARRIER
PROTEIN SIGNATURE


PR00928B 13.53 3.769e-
10 103-124


1433 


BL01113 


C1q domain proteins.


BL01113B 18.26 7.049e-
15 14-50 BL01113C
13.18 7.000e-12 82-102 


1434 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 7.983e- 
10 135-150 


1436 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 1.000e-
12 84-103 


1438 




Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.500e- 
09 250-268 BL00290A 
20.89 4.000e-09 188-
211 


1440 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 38-52 


1441 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e-
09 88-102


1444 


BL00422 


Granins proteins.


BL00422D 19.48 1.000e-
08 114-138 


1445 


PD01841


PHOSPHORYLASE KINASE
ALPHA MUSCL.


PD01841A 21.71 1.000e-
40 73-123 PD01841B
14.35 1.000e-40 144-
185 PD01841D 17.87
1.000e-40 206-258
PD01841F 13.36 1.000e-
40 296-345 PD01841G
24.26 1.000e-40 349-
403 PD01841I 23.00
1.000e-40 494-536
PD01841J 14.94 1.000e-
40 895-932 PD01841L
18.42 1.000e-40 1083-
1125 PD01841E 18.60
9.719e-38 258-296
PD01841K 14.81 1.000e-
35 1041-1071 PD01841H
21.30 3.189e-31 435-
472 PD01841C 13.78
1.000e-25 185-206
PD01841M 10.82 1.250e-
20 1175-1194




PF00816 


H-NS histone family.


PF00816B 13.84 8.875e-
09 190-220 




PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE 


PR00048A 10.52 2.080e- 
09 402-416 


id a a 


DM00315


072 RIBONUCLEASE 
INHIBITOR. 


DM00315D 18.40 7.393e-
09 23-67 


1451 


BL00030


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030B 7.03 2.800e- 
10 94-104 


1454 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688D 13.44 7.146e-
09 332-405 


1455 


PF00777 


Sialyltransferase
family. 


PF00777C 18.60 2.929e- 
22 4-59 


1457 


BL00927 


Trehalase proteins. 


BL00927C 10.83 8.085e-
09 42-53


14S(f " 


BL00545


Aldose 1-epimerase 
proteins. 


BL00545C 11.28 7.353e- 
17 169-182 BL00545A 
10.20 2.071e-15 73-89 
BL00545B 13.10 3.942e- 

09 140-151


1466 


PR00097


ANTHRANILATE SYNTHASE 
COMPONENT II SIGNATURE 


PR00097C 9.42 9.0^9e- 
09 233-245 


1472 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins. 


BL01129E 13.25 5.250e- 
22 170-195 BL01129C 
25.56 9.526e-18 63-106 


1473 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e- 
09 2114-2145


1475 


PF00686


Starch binding domain
proteins.


PF00686A 13.45 9.100e- 
09 267-277 



1477


PF00566


Probable rabGAP domain
proteins.


PF00566A 12.64 7.333e-


1478 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030B 7.03 9.400e-
10 43-53 


1479 


DM00406 


GLIADIN.


DM00406 7.73 8.541e-10
292-305 


1480 


BL00290


Immunoglobulins and
major histocompatibility
complex proteins.


BL00290B 13.17 2.385e-
15 69-87 BL00290A
20.89 5.091e-11 12-35


1481 


PR00150


CARBOXYLASE SIGNATURE 


PR00150F 10.45 9.039e- 
09 21-51 


1482 


PF00780


Domain found in NIK1-
like kinases, mouse
citron and yeast ROM.


PF00780I 14.69 4.825e- 
09 107-137 


1483 


BL01160 


Kinesin light chain
repeat proteins.


BL01160B 19.54 1.153e- 
09 108-162 


1485 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 5.909e- 
25 17-56


1486




Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e-
09 34-50 


1488 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 9.586e- 
10 116-162 


1490 


BL00166 


Enoyl-CoA
hydratase/isomerase
proteins.


BL00166D 22.87 2.607e-
24 190-226 BL00166C
18.93 5.500e-14 140-
167 BL00166B 16.92
9.357e-11 93-115


1491 




Guanylate cyclases 
proteins . 


BL00452D 28.59 3.700e-
31 63-106 BL00452B
11.92 3.045e-13 115- 
131 


1492 




LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 3.667e- 
09 532-546 


1497


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.000e-
11 384-400 BL00107A
18.39 5.345e-11 322-
353 


1500


PF00876 


Ogre family. 


PF00876E 7.99 1.947e- 
10 107-117 


1502 


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 4.789e- 
24 112-155 


1503 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 4.789e- 
24 112-155 


1505 


BL01177 


Anaphylatoxin domain 
proteins. 


BL01177B 20.64 5.800e- 
24 448-475 BL01177C 
17.39 5.333e-19 402- 
421 BL01177B 13.61 
7.840e-16 155-171 
BL01177D 17.50 1.900e- 
15 427-445 


1506 


BL00972


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972D 22.55 5.500e-
14 311-336 BL00972A
11.93 7.429e-14 48-66
BL00972E 20.72 8.759e-
10 341-363 


1512 


BL00523 


Sulfatases proteins. 


BL00523E 19.27 4.536e- 
22 76-106 BL00523D 
9.89 1.563e-11 40-52
BL00523F 10.85 4.162e- 
09 159-170 BL00523G 
9.46 5.333e-09 256-266 


1516 


BL00914 


Syntaxin / epimorphin
family proteins. 


BL00914 24.91 7.045e- 
14 168-218 


1518 


BL00600


Aminotransferases class-
III pyridoxal-phosphate
attachment si.


BL00600A 17.98 6.143e-
19 98-122 BL00600E
16.43 1.771e-17 302-
331 BL00600G 12.43
9.625e-17 377-396
BL00600B 19.60 5.091e-
15 160-186 BL00600C
16.18 6.040e-12 190-
206 BL00600F 8.77
1.000e-11 343-356
BL00600D 8.71 1.000e-
10 281-295


152* 


PD00930 


PROTEIN GTPASE DOMAIN
ACTIVATION.


PD00930B 33.72 9.600e-
18 41-82 


1528 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320B 12.19 4.774e-
11 192-207 PR00320B
12.19 8.839e-11 272-
287 PR00320B 12.19
9.743e-10 106-121
PR00320A 16.74 1.878e-
09 192-207 PR00320A
16.74 2.317e-09 106-
121 PR00320A 16.74
8.683e-09 272-287
PR00320C 13.01 8.800e- 
09 106-121 


1538


DM01970


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 4.508e-
15 171-194


1539


PF00781


Diacylglycerol kinase 
catalytic domain 
proteins (presumed) . 


PF00781D 11.11 7.593e-
10 103-127 


1540 


PR00965 


OCULAR ALBINISM TYPE 1 
PROTEIN SIGNATURE 


PR00965H 10.73 1.231e-
29 312-334 PR00965E
12.93 5.846e-29 172-
195 PR00965F 5.98
1.123e-28 209-231
PR00965C 15.04 1.000e-
27 131-151 PR00965D
5.84 1.000e-27 150-170
PR00965G 8.52 2.440e-
27 258-279 PR00965B
4.80 8.650e-26 88-109
PR00965A 12.52 1.000e-
25 35-55 PR00965I 
3.91 6.442e-25 385-406 


1541 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 9.719e- 
17 163-207 


1543 


PD02699 


PROTEIN DNA-BINDING
BINDING DNA.


PD02699C 24.84 1.000e-
40 599-646 PD02699A 
8.91 2.286e-34 219-248 
PD02699B 18.28 6.143e- 
21 485-509 


1544 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.857e-
10 182-197 PR00049D 
0.00 7.102e-09 67-82 


1547


BL00951 


ER lumen protein 
retaining receptor 
proteins . 


BL00951C 19.35 1.000e-
40 93-142 BL00951D
13.94 8.714e-40 142-
177 BL00951A 15.10
1.000e-38 2-38
BL00951B 14.23 6.250e- 
33 38-69 


1548 


BL00536 


Ubiquitin-activating
enzyme proteins.


BL00536F 13.65 8.920e-
30 279-318 BL00536D
22.91 5.737e-24 21-65
BL00536E 16.94 4.696e-
18 248-279


1549 


PR00139 


ASPARAGINASE/GLUTAMINASE
FAMILY SIGNATURE 


PR00139C 11.72 9.679e- 
09 550-569 


1553 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.119e- 
09 58-73


1556 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.276e-
13 67-105 


1557 


BL01228 


Hypothetical cof family 
proteins. 


BL01228D 17.44 8.105e-
12 107-132 


1558 


BL01228


Hypothetical cof family
proteins.


BL01228D 17.44 8.105e- 
12 107-132 


1559 


BL01228 


Hypothetical cof family
proteins.


BL01228D 17.44 8.105e-
12 107-132 


1562 


BL00522 


DNA polymerase family X


BL00522C 11.90 6.600e-
18 412-436 BL00522B
27.30 1.738e-16 364-
410 BL00522A 25.52
6.000e-16 279-326
BL00522E 19.63 6.123e-
14 502-532 BL00522F 
14.90 2.385e-13 551- 
575 


1563 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 1.947e-
11 46-59 


1564 


BL00299


Ubiquitin domain 
proteins . 


BL00299 28.84 2.823e- 
10 324-376 


1566 


BL01013 


Oxysterol-binding
protein family proteins.


BL01013D 26.61 8.594e-
17 184-228 BL01013C 
9.97 4.906e-12 14-24 


1567 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 3.400e-10 
378-389 BL00678 9.67
5.800e-10 418-429
BL00678 9.67 8.800e-10 
295-306 


1570


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 5.235e-
17 297-313 BL00479A
19.86 6.625e-15 271-
294 BL00479A 19.86
2.667e-14 147-170
BL00479B 12.57 6.294e-
12 173-189 


1576 


PR00665 


OXYTOCIN RECEPTOR
SIGNATURE 


PR00665G 12.36 4.673e- 
24 364-384 PR00665D 
9.93 1.200e-22 138-155 
PR00665F 11.73 4.000e- 
22 337-354 PR00665C
5.89 1.000e-20 65-80 
PR00665B 5.29 4.337e- 
19 24-39 PR00665E 
5.60 2.929e-15 246-260 
PR00665A 5.99 5.622e- 
15 11-25 


1577 


DM00099


4 kw A55R REDUCTASE
TERMINAL
DIHYDROPTERIDINE.


DM00099B 14.73 9.308e-
10 127-137 


1579 


BL00524 


Somatomedin B domain 
proteins . 


BL00524A 9.65 6\776e- 

SH is 


1580 


PD02894 


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE . 


PD02894B 13.93 6.959e- 
J.o J.0 d- £lo PD02894A 
21.96 2.125e-10 57-103 


1581 


BL00411 


Kinesin motor domain
proteins.


BL00411C 15.04 5.292e-
12 32-54 BL00411H
15.66 4.441e-11 245-
276 


1582 


PR00604 


CLASS IA AND IB 
CYTOCHROME C SIGNATURE 


PR00604A 11.13 2.440e-
09 79-87 


1584 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 1.000e-
10 225-238 


1585 


DM01551


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 9.455e- 
11 125-145 


1586 


DM01354 


kw TRANSCRIPTASE REVERSE
II ORF2.


DM01354S 11.61 7.750e- 
09 474-495


1587 


PR00072


MALIC ENZYME SIGNATURE 


PR00072B 13.77 7.955e-
33 180-210 PR00072A
12.75 6.040e-25 120-
145 PR00072C 11.42
2.286e-24 216-239
PR00072D 10.77 3.400e-
22 276-295 PR00072E
10.54 1.360e-19 301-
318 PR00072G 10.45
5.304e-19 433-450
PR00072F 8.87 5.935e- 
15 332-349 


1589 


BL00191 


Cytochrome b5 family,
heme-binding domain
proteins.


BL00191H 15.64 1.537e-
22 61-113 BL00191K 
17.38 9.027e-12 398- 
442 


1590 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 7.716e-
13 211-224 DM01970B 
8.60 2.157e-12 94-107 


1591 


DM00517 


5 kw NUCLEAR 60-7 NUP1
CHROMOSOME. 


DM00517B 10.96 6.625e- 
16 1175-1193 DM00517A 
8.21 1.000e-11 1015-
1026


1592


BL00037 


Myb DNA-binding domain 
proteins repeat proteins 
proteins. 


BL00037B 15.32 3.250e- 
27 116-142 BL00037A 
16.68 2.500e-24 83-107 
BL00037A 16.68 3.250e- 
12 31-55 BL00037B
15.92 3.526e-11 64-90
BL00037C 16.86 9.654e- 
10 146-164 


1595 


BL00028 


Zinc finger, C2H2 type, 
domain proteins.


BL00028 16.07 1.514e-
09 110-127 


1598 


PF00628 


PHD- finger. 


PF00628 15.84 3.250e- 
11 1667-1682


1599


PR00014


FIBRONECTIN TYPE III
REPEAT SIGNATURE 


PR00014D 12.04 5.500e- 
09 980-995 


1600 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 6.571e-
10 30-39 


1602 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 5.402e-
10 136-187 


1605 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins.


PF00651 15.00 3.571e- 
10 44-57 


1607 


BL00252 


Interferon alpha, beta 
and delta family 
proteins . 


BL00252A 18.49 6.657e- 
23 20-57 BL00252B 
19.73 9.125e-16 58-103


1610


DM00215


PROLINE-RICH PROTEIN 3.


DM00215 19.43 1.000e-
08 61-94 


1611


BL00904 


Protein
prenyltransferases alpha
subunit repeat proteins
proteins.


BL00904C 8.98 7.353e-
10 91-125 BL00904D
1.47 6.010e-09 127-168 


1612 


PF00168


C2 domain proteins. 


PF00168C 27.49 3.250e- 
09 365-391 


1613


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 6.051e-
09 932-983 BL00412D
16.54 7.153e-09 933-
984 




BL00559


Eukaryotic molybdopterin
oxidoreductases
proteins.


BL00559I 13.63 3.531e-
25 54-83 BL00559K
13.17 2.957e-18 197-
224 BL00559J 19.63
6.870e-16 124-176
BL00559L 13.60 9.000e-
16 266-284 


1615 


PD01427 


TRANSFERASE 
METHYLTRANSFERASE BI.


PD01427B 22.45 3.02be- 
22 500-541 PD01427A 
19.94 8.773e-18 439-
472


1616 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat
proteins.


BL00115Z 3.12 7.485e-
09 152-201 BL00115Z 
3.12 9.603e-09 145-194 


1617


BL00303


S-100/ICaBP type calcium
binding protein. 


BL00303B 26.15 7.750e- 
32 51-88 BL00303A 
21.77 1.947e-31 4-41 


1618 


BL01254 


Fetuin family proteins. 


BL01254F 10.02 8.754e- 
09 137-147 


1619 


PD01888 


PEPTIDE REDUCTASE 
PROTEIN METHI. 


PD01888B 25.10 1.000e-
40 47-97 PD01888C 
21.56 7.000e-30 125- 
155 PD01888A 12.84 
8.800e-15 7-23 


1621 


PR00239 


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.455e- 
09 692-704 PR00239E 
1.58 4.580e-09 697-709
PR00239E 1.58 4.580e-
09 702-714 PR00239E 
1.58 5.193e-09 703-715 


1622 


PR00860 


VERTEBRATE
METALLOTHIONEIN
SIGNATURE


PR00860B 7.04 1.900e-
18 27-41 PR00860C
9.61 1.474e-14 41-51
PR00860A 5.46 1.720e-
14 5-18 


1624 


PR00784 


MITOCHONDRIAL BROWN FAT
UNCOUPLING PROTEIN
SIGNATURE 


PR00784D 15.86 8.027e- 
11 77-95 


1626 


BL00325 


Actin-depolymerizing 
proteins . 


BL00325B 21.66 1.000e-
40 93-139 BL00325A 
24.83 6.786e-23 61-93 


1631 


BL00064 


L-lactate dehydrogenase 
proteins. 


BL00064B 23.57 1.000e-
40 82-130 BL00064C
17.28 1.000e-40 137- 
182 BL00064E 27.20 
1.000e-40 223-275 
BL00064F 25.14 7.882e- 
36 286-331 BL00064A 
21.16 1.000e-33 22-60 
BL00064D 14.19 6.500e- 
31 182-212 


1632 


PR00063


RIBOSOMAL PROTEIN L27
SIGNATURE


PR00063B 15.24 9.700e-
11 59-84 PR00063A
11.71 1.614e-09 34-59 




PR00239


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239D 0.00 1.105e- 
11 35-49 PR00239C 
3.51 2.538e-09 37-45 


1636 


BL01210


Caveolins proteins. 


BL01210B 13.92 9.531e- 
10 133-183 


1637 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 5.388e- 
11 11-43 


1639 


BL01183 


ubiE/COQ5
methyltransferase family
proteins. 


BL01183B 21.31 8.144e- 
12 132-177 


1640 


PR00015


GRAM-POSITIVE COCCUS
SURFACE PROTEIN ANCHOR 
SIGNATURE 


PR00015B 9.84 8.4*8e- 
10 128-149 


1641 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320B 12.19 5.935e-
11 364-379 PR00320A
16.74 7.828e-11 364-
379 PR00320C 13.01
2.800e-10 279-294
PR00320C 13.01 2.800e-
10 364-379 PR00320B
12.19 5.114e-10 279- 
294 PR00320A 16.74 
1.659e-09 279-294
PR00320A 16.74 2.098e- 
4^9-244 


1642 


PF00023 


Ank repeat proteins.


PF00023A 16.03 6.464e-
09 114-130 


1643 


PR00169 


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.806e- 
11 74-94 


1644 


BL00678


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 2.200e-10 
109-120 BL00678 9.67 
5.737e-09 528-539 


1645 


BL01108 


Ribosomal protein L24 
proteins . 


BL01108A 20.33 7.356e- 
17 56-89 


1646 


PR00380


KINESIN HEAVY CHAIN
SIGNATURE


PR00380A 14.18 9.270e-
21 103-125 PR00380D
9.93 6.300e-18 386-408
PR00380C 13.18 7.923e- 
16 332-351 PR00380B 
12.64 6.657e-15 292- 
310 


164? 


DM01242 


3 THREONINE--TRNA
LIGASE.


DM01242C 17.15 9.791e-
37 340-381 DM01242E 
23.00 5.071e-31 463- 
505 DM01242D 23.29 
3.925e-30 420-463 
DM01242B 23.57 8.054e- 
13 265-314 DM01242F 
10.61 7.618e-14 526- 
540 


1649 


PD00126 


PROTEIN REPEAT DOMAIN 
TPR NUCLEA. 


PD00126A 22.53 5.500e-
10 13-34 


1651 


BL01160 


Kinesin light chain
repeat proteins.


BL01160B 19.54 6.720e-
11 431-485 


1652 


BL00933


fggy family of 
carbohydrate kinases 
proteins . 


BL00933A 17.50 4.673e- 
12 11-35 BL00933E 
13.80 9.217e-09 456- 
472 


1653 


BL00795 


Involucrin proteins. 


BL00795C 17.06 2.988e- 
10 70-115 


1654 


BL00982


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 7.750e- 
17 302-334 


1655 


BL00982


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 7.750e- 
17 282-314 


1656 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 1.391e- 
16 607-630


1657




TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 7.938e- 
11 114-136 


1658 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE


PR00910A 2.51 8.889e-
10 442-455 


1659 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 4.140e-
12 376-401 BL00972E
20.72 5.629e-09 446-
468 


1660 




Actins proteins. 


BL00406D 12.58 8.767e- 
15 188-243 


1661


PR00105


CYTOSINE-SPECIFIC DNA
METHYLTRANSFERASE
SIGNATURE


PR00105A 10.36 4.900e- 
13 1140-1157 PR00105B 
12.32 2.800e-12 1259- 
1274 PR00105C 10.36 

A.uWUC 1U 1JUJ 


1662 


BL00280


Pancreatic trypsin
inhibitor (Kunitz)
family proteins.


BL00280 24.61 3.172e-
33 3119-3163 


1663 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e- 
23 107-125 PR00319C 
13.41 5.714e-20 89-105 
PR00319A 15.27 5.286e- 
19 51-68 PR00319B 
11.47 8.200e-19 70-85


1664 


BL00018


EF-hand calcium-binding
domain proteins.


BL00018 7.41 5.050e-10
489-502 


1667


PD01066


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 8.500e- 
38 7-46 


1669 


BL01153 


NOL1/NOP2/sun family
proteins.


BL01153D 19.69 1.188e-
17 115-141 BL01153C
13.67 8.977e-15 66-80
BL01153B 20.52 1.885e- 
10 13-37 


1671 


PR00678


PI3 KINASE P65 
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 3.100e- 
10 1146-1169 


1672 


BL00598 


Chromo domain proteins. 


BL00598 14.45 8.800e-
20 27-49 


1673 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.75 8.329e-
09 686-707 


1674 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.580e-
11 343-358 PR00049D
0.00 1.286e-10 342-357


1676 


PR00747 



GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 


PR00747H 12.76 8.636e-
19 427-448 PR00747G
14.50 2.286e-18 368-
393 PR00747C 12.06
7.500e-18 112-131
PR00747A 14.05 4.600e-
17 42-63 PR00747D
15.23 8.759e-17 163-
183 PR00747E 15.13
8.244e-15 254-272
PR00747B 7.65 5.355e-
13 75-90 PR00747F
13.56 8.714e-10 311-
328


1677


PR00747 


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 


PR00747H 12.76 8.636e-
19 309-330 PR00747G
14.50 2.286e-18 250-
275 PR00747C 17.06

7.500e-18 112-131 
PR00747A 14.05 4.600e-
17 42-63 PR00747B 
7.65 5.355e-13 75-90 
PR00747F 13.56 8.714e- 
10 193-210 


1680


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 4.600e-10 
406-417 BL00678 9.67
6.684e-09 320-331 


1681 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 4.600e-10
329-340 BL00678 9.67 
6.684e-09 243-254 


1683 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 1.346e- 
13 389-410 


1685 


PR00646


RDC1 ORPHAN RECEPTOR
SIGNATURE


PR00646H 6.32 4.188e-
09 755-771 


1690 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e-
09 75-129 


1691 


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE


PR00456E 3.06 7.281e-
10 418-433 PR00456E
3.06 7.281e-10 419-434
PR00456E 3.06 8.125e-
10 420-435 


1692 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e- 
10 487-502 PR00456E 
3.06 7.281e-10 488-503 
PR00456E 3.06 8.125e- 
10 489-504 


1693


BL00674 


AAA-protein family 
proteins . 


BL00674C 22.60 8.043e- 
24 274-317 BL00674B
4,46 4.000e-23 241-2T5 — 
BL00674D 23.41 8.560e- 
18 338-385 BL00674E 
15.24 1.720e-15 414- 
434 




PR00409 


PHTHALATE DIOXYGENASE
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e- 
10 427-447 


1698 


PR00466 


CYTOCHROME B-245 HEAVY 
CHAIN SIGNATURE 


PR00466C 10.17 3.443e- 
13 187-208 PR00466B 
5.03 5.500e-11 162-186
PR00466F 9.16 6.159e- 
09 498-517 


1699 


BL00028 


Zinc finger, C2H2 type,
domain proteins. 


BL00028 16.07 9.217e- 
12 283-300 BL00028 
16.07 3.769e-11 255-
272 BL00028 16.07
5.154e-11 171-188
BL00028 16.07 5.500e-
11 227-244 BL00028
16.07 1.600e-10 199-
216 


1700 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 3.348e-
15 62-102 BL01019B 
19.49 4.000e-15 107- 
162 


1703 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.484e- 
12 200-239 


1707 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.558e-
14 134-153 


1710 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 2.565e-
10 116-130 PR00019B 
11.36 4.600e-09 113- 
127 PR00019B 11.36 
7.120e-09 204-218 


1711 


BL01159 


WW/rsp5/WWP domain
proteins . 


BL01159 13.85 6.523e- 
11 232-247 BL01159 
13.85 5.408e-10 613- 
628 


1712 


PF00023 


Ank repeat proteins . 


PF00023A 16.03 7.000e- 
10 187-203 


1713


PF00642


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.550e- 
11 230-241 


1714 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar).


PF00642 11.59 9.550e-
11 230-241 


1715 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 7.129e- 
09 7-51 


1718 


BL00353


HMG1/2 proteins. 


BL00353C 14.83 6.018e- 
10 136-183 BL00353B 
11.47 8.866e-09 86-136 


1719 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412D 16.54 5.408e- 
09 432-483 


1721 


BL00038


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 8.448e-
12 79-100 BL00038A
13.61 4.000e-11 52-68


1723 


PD00567


PROTEIN RNA-BINDING RNA
REPEAT HYD.


PD00567C 9.17 8.500e-
09 418-428


1724 


BL01279 


Protein-L-
isoaspartate(D-
aspartate) O-
methyltransferase signa.


BL01279A 24.27 5.663e-
12 233-281 


1728 


BL00018 


EF-hand calcium-binding
domain proteins.


BL00018 7.41 2.059e-11
73-86 BL00018 7.41
4.176e-11 157-170


1730 


BL00594 


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 1.089e- 
09 17-61


1731 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.676e-
10 296-350 


1732 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.676e- 
10 316-370 


1733 


PF00850 


Histone deacetylase
family. 


PF00850F 15.70 4.349e- 
22 246-279 PF00850D 
14.76 6.850e-20 177- 
201 PF00850E 3.88 
8.691e-18 209-235 
PF00850G 22.75 4.098e- 
14 281-323 


1734 


BL00354


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook) . 


BL00354C 6.61 5.932e- 
09 292-307 


1735 


DM00179


w KINASE ALPHA ADHESION
T-CELL.


DM00179 13.97 5.263e- 
10 492-502 


1743 


PRO 04 4 9 


TRANSFORMING PROTEIN P21
RAS SIGNATURE 


PR00449A 13.20 1.188e-
11 5-27 PR00449D
10.79 2.241e-10 109-
123 PR00449E 13.50
9.289e-10 144-167


1744 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e-
11 5-27 PR00449D
10.79 2.241e-10 109- 
123 PR00449E 13.50 
9.289e-10 144-167 


1745 


BL00720 


Guanine-nucleotide
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 8.297e-
15 136-160 


1746 


PR00081


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.727e-
11 45-57 PR00081E 
17.54 3.935e-10 150- 
168 


1747 


BL00439 


Acyltransferases
ChoActase / COT / CPT 
family proteins. 


BL00439H 18.24 8.435e-
14 65-91 BL00439G 
13.40 2.895e-12 3-14 


1749 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 7.158e-
11 4-20 


1751 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-
14 33-46 PD00066
13.92 1.000e-13 89-102
PD00066 13.92 7.000e-
13 61-74 PD00066
13.92 6.571e-12 117-
130


1753 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 6.516e- 
18 33-77 


1754 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.393e- 
09 490-521 BL00790I 
20.01 2.821e-09 60-91 
BL00790I 20.01 6.357e- 
09 287-318 


1756 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.750e-
35 10-49 


1758 


DM00406 


GLIADIN. 


DM00406 7.73 7.600e-09 
653-666 


1762 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 4.529e-
09 224-278


1765 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 5.950e- 
11 146-167 


1775 


PF00023 


Ank repeat proteins.


PF00023A 16.03 3.077e- 
14 523-539 


1776 


BL00942 


glpT family of 
transporters proteins. 


BL00942F 15.07 4.343e- 
10 371-389 BL009423 
20.36 8.040e-09 94-137 


1777 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.373e- 
09 279-312


1778 


BL00084 


Copper type II, ascorbate-dependent
monooxygenases proteins.


BL00084D 25.11 3.700e- 
20 169-224 BL00084B 
24.26 8.134e-16 10-58 
BL00084C 27.71 8.412e- 
11 107-158 


1779 


BL01013


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 3.758e- 
18 611-655 BL01013A 
25.14 2.801e-15 344- 
380 BL01013C 9.97 
6.308e-13 435-445 
BL01013B 11.33 3.717e- 
12 409-420 


1783 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 


1784 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 



* Results include, in order: accession number subtype; raw score; p-value; position of
signature in amino acid sequence.
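As an illustration only, the field order given in the footnote can be read mechanically. The sketch below is not part of the original disclosure: the names `Hit` and `parse_results` are hypothetical, and it assumes that a RESULTS cell has already been rejoined into a single line (so that a wrapped value such as "6.018e-" / "10" reads "6.018e-10") and that fields are separated by whitespace.

```python
import re
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One signature hit, in the field order stated in the footnote."""
    subtype: str      # accession number subtype, e.g. "BL00353C"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature
    end: int          # last residue of the signature


# Assumed layout: two letters, five digits, optional subtype letter,
# then score, p-value in e-notation, and a start-end residue range.
_HIT = re.compile(
    r"(?P<subtype>[A-Z]{2}\d{5}[A-Z]?)\s+"
    r"(?P<score>\d+(?:\.\d+)?)\s+"
    r"(?P<p>\d+(?:\.\d+)?e-\d+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)"
)


def parse_results(cell: str) -> List[Hit]:
    """Return every hit found in one rejoined RESULTS cell."""
    return [
        Hit(m["subtype"], float(m["score"]), float(m["p"]),
            int(m["start"]), int(m["end"]))
        for m in _HIT.finditer(cell)
    ]


# Example using the entry listed for SEQ ID NO: 1718.
hits = parse_results(
    "BL00353C 14.83 6.018e-10 136-183 BL00353B 11.47 8.866e-09 86-136"
)
```

Each returned record carries the four footnoted fields, with the position split into its start and end residues.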
TABLE 4



SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


2 


ig


Immunoglobulin domain. 


2.1e-32 


109.5 


3 


pkinase


Eukaryotic protein kinase 
domain 


1.3e-29 


110.7 


4 


zf-C2H2 


Zinc finger, C2H2 type


1.6e-21 


84.9 


5 


fn3


Fibronectin type III domain


0 


1097.1 


6 


fn3 


Fibronectin type III domain 


0 


1035.0 


7 


fn3


Fibronectin type III domain


0 


1090.4 


8 


fn3 


Fibronectin type III domain 


0 


1097.1 


9 


TBC 


TBC domain 


4e-40


146.7


10 


p450 


Cytochrome P450 


9.5e-17


62.0 


12 


ank 


Ank repeat 


6e-20 


79.7 


14 


ig 


Immunoglobulin domain 


1.7e-05 


22.7 


15


zf-MYND


MYND finger


1.3e-06


35.4


16


zf-MYND


MYND finger


1.3e-06


35.4 


17 


zf-C2H2 


Zinc finger, C2H2 type


1.7e-99


343.9


18 


CAP_GLY


CAP-Gly domain 


1.2e-25


98.7 


20 


IMPDH_C


IMP dehydrogenase / GMP
reductase C terminus


1.6e-119


410.5


21


IMPDH_C


IMP dehydrogenase / GMP
reductase C terminus


4.3e-102


352.6 


22 


pkinase


Eukaryotic protein kinase
domain


2.4e-79


277.0 


23 


pkinase 


Eukaryotic protein kinase 
domain 


8.4e-74


258.* 


25 


RNA_pol_A 


RNA polymerase alpha subunit


0 


1077.7 


26 


Cla 


Cl_n domain 


1.9e-10


44.4 


27 


3 




7.8e-32


111.2 


28 


Ribosomal_L2
3




1e-29


104.2


30 


zf-A20 


A20-like zinc finger 


1.5e-10


48.5 


31 


zf-A20


A20-like zinc finger 


1.5e-10


48.5 


32 


FMN_dh


FMN-dependent dehydrogenase


5.4e-179


608.1


34 


PID 


Phosphotyrosine interaction
domain (PTB/PID)


3.8e-59


209.9 


35 


ig 


Immunoglobulin domain 


1.4e-13 


48.8 


36 


ig 


Immunoglobulin domain


1.4e-13


48.8


40 


kinesin 


Kinesin motor domain 


o . ' fc; /© 




44 


Ets


Ets-domain


1.4e-56


182.1


45 


Ets


Ets-domain


1.4e-56


182.1 


46 


LRR 


Leucine Rich Repeat 


1 7p _ 1 -j 
X . / tl J.O 




48 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-162


ceo a 


49 


ITAM 


Immunoreceptor tyrosine-based
activation mot 


1.4e-05 




50 


UCH-2


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0


51 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


52 


ras 


Ras family 


8.5e-45 


162.3 


53 


PRK 


Phosphoribulokinase 


2.1e-65 


230.7 


54 


myb_DNA-
binding 


Myb-like DNA-binding domain 


0.096 


15.2


55


voltage_CLC 


Voltage gated chloride channels 


3.3e-18S 


631.9 


56 


sugar_tr


Sugar (and other) transporter 


0.00015 


-64.3 


57 


TBC 


TBC domain 


2.2e-37 


137.6


58 


ank 


Ank repeat 


5.9e-25 


96.3


59 


ank 


Ank repeat 


5.9e-25 


96.3 


67 


PMP22_Claudi
n 


PMP-22/EMP/MP20/Claudin family 


7.9e-49 


175.* 


68 


C2


C2 domain


7.9e-54


192.2 


69 


C2 


C2 domain 


2.3e-54


194.0 


70 


Kelch 


Kelch motif 


9.4e-99 


341.5 


72


ig


Immunoglobulin domain 


8.2e-28 


94.7 




pkinase 


Eukaryotic protein kinase
domain


8e-69


242.1




pkinase 


Eukaryotic protein kinase 
domain 


2.8e-38 


140.6 


76 


zf-
C4_Topoisom


Topoisomerase DNA binding C4 
zinc fing 


5.4e-54 


192.8 


83 


Peptidase_S9


Prolyl oligopeptidase family 


4.3e-10 


3^.8 


84 


fn3


Fibronectin type III domain


4.1e-51


183.2 


8^ 




Src homology domain 2 


3.1e-22 


67.7


oo 


ig


Immunoglobulin domain 


0.0091 


14.0 




WD40 


WD domain, G-beta repeat 


2.1e-21 


84.6 




laminin_G


Laminin G domain


6.1e-27


98.5


93 


AMP-binding


AMP-binding enzyme


2.4e-13


-37.2 


95 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-59 


211.4 


96 


pkinase 


Eukaryotic protein kinase 
domain 


2.6e-51


183.9 


97 


adh_short


short chain dehydrogenase 


2e-61 


217.5 


98 


kinesin


Kinesin motor domain 


2.2e-86 


300.4 


101 


IRS 


PTB domain (IRS-1 type) 


5.4e-36 


133.0 


102 


AAA 


ATPases associated with various 
cellular act 


<5.8e-0!i 


-5.2 


104 


pkinase 


Eukaryotic protein kinase 
domain 


2.7e-73 


256.5 


106 


ras 


Ras family 


8.3e-24 


92.5 


107 


FYVE 


FYVE zinc finger 


5.4e-27 


100.7 


108 


Cyt_reductas
e


FAD/NAD-binding Cytochrome
reductase 


7.7e-61 


215.5 


109 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-122


420.0 


113 


pkinase 


Eukaryotic protein kinase 
domain 


4e-88


306.2


116 


PH 


PH domain 


3.1e-11


45.2 


117 


lipocalin 


Lipocalin / cytosolic fatty-
acid binding pr 


2.4e-14 


53.5 


118 


pkinase 


Eukaryotic protein kinase 
domain 


4.5e-20


76.3 


120 


WD40


WD domain, G-beta repeat 


2.4e-14 


61.1 


121 


WD40


WD domain, G-beta repeat


2.4e-14 


61.1 


123 


IF5_eIF4_eIF 
2 


eIF4-gamma/eIF5/eIF2-epsilon


1e-32


122.2 


124 




Immunoglobulin domain 


6.5e-08


30. £ 


127 


mito_carr


Mitochondrial carrier proteins


3e-16


58.6 


128 


PP2C 


Protein phosphatase 2C 


2.2e-71


250.6 


129 


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family 


3.1e-20 


80.6 


130 


pfkB 


pfkB family carbohydrate kinase 


4.5e-42


137.1 


133 


ACBP 


Acyl CoA binding protein 


4.6e-22


86.7 


134


rrm


RNA recognition motif.


1.2e-31 


118.5 


135


IQ


IQ calmodulin-binding motif


2.6e-08


41.0 


IJ D 


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family


9.3e-22 


8*. 7


139


WH2


Wiskott Aldrich syndrome
homology region 2 


0.0067 


23.1 


140


zf-C2H2


Zinc finger, C2H2 type 


1.7e-82 


287.5 


141 


Peptidase_S2
6 


Signal peptidase I 


5.7e-10 


3*. 7 


143


arf


ADP-ribosylation factor family 


1.2e-39 


145.2 


146 


KRAB 


KRAB box 


7.3e-30 


112.6 


148 


DUF6 




0.096


8.0 


149 


PDEase 


3'5'-cyclic nucleotide
phosphodiesterase 


3.8e-80 


231.1 


_1 _ 


S4 


S4 domain 


1.1e-08


42.3 


153


tRNA-synt_1d


tRNA synthetases class I (R)


3.8e-103 


356.1


154


Cyt_reductas
e


FAD/NAD-binding Cytochrome
reductase 


7.8e-60 


212.2 


155


ras


Ras family


3.6e-28


107.0


157


actin


Actin


3.6e-26


87.1


158 


Jacalin 


Jacalin-like lectin domain


0.09


-24.9


160 


zn_carbopept 


Zinc carboxypeptidase 


5e-138


471.9


165 


pkinase 


Eukaryotic protein kinase 
domain 


5.1e-67


236.1 


167


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-07 


27.0 


168 


Ribosomal_S1
5


Ribosomal protein S15


1.1e-06


29.0


169 


DEAD 


DEAD/DEAH box helicase


1e-48


157.0 


171 


DUF59 


Domain of unknown function 
DUF59 


0.07 


-17.4 


172 


pkinase 


Eukaryotic protein kinase 
domain 


3.7e-15 


58.6 


173 


globin 


Globin 


4.6e-18 


67.4 


174 


WW 


WW domain 


7.3e-06 


32.9


175 


ras 


Ras family 


1e-31


118.8


176 


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family 


2.2e-17


71.0 


179 


zf-C2H2


Zinc finger, C2H2 type


1.5e-99


344.2 


180


C1q


C1q domain


8.8e-72 


251.9 


190 


Y_phosphatas
e


Protein-tyrosine phosphatase


4.9e-287 


967.0 


191 


efhand 


EF hand 


7.5e-16 


66.1 


193 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-82 


285.6 


194 


bromodomain 


Bromodomain 


5.8e-31 


111.4


195 


PALP 


Pyridoxal-phosphate dependent
enzyme 


2.5e-64 


227.1 


i$1 


DnaJ 


DnaJ domain 


1.6e-38


141.4 


199 


RrnaAD 


Ribosomal RNA adenine 
dimethylases 


0.00018


16". 9 


200 


acid_phospha
t 


Histidine acid phosphatase 


2.5e-10 


37 ,~ ' " 


201


WH2


Wiskott Aldrich syndrome 
homology region 2 


0.00048 


26.9 


204 


vATP- 
synt_AC39 


ATP synthase (C/AC39) subunit 


1.3e-159 


543.7 


205 


vATP-
synt_AC39 


ATP synthase (C/AC39) subunit 


1.6e-139 


47*. 9 


206 


ldl_recept_a 


Low-density lipoprotein
receptor domain


2.4e-25


97.6 


209 


ank 


Ank repeat 


1.4e-19 


78.4 


210 


Rhomboid


Rhomboid family 


0.0035 


1.2


211 


C1q


C1q domain


1.6e-70 


247.7 


212 


UQ_con 


Ubiquitin-conjugating enzyme


7.4e-74 


258.8 


213 


UQ_con 


Ubiquitin-conjugating enzyme


1e-53


191.9 


215 


DEAD 


DEAD/DEAH box helicase 


1.8e-43 


140.4 


216 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family 


4.5e-21 


83.4 


218 


Glycos_trans
f_2


Glycosyl transferases


4e-21


83.5 


219 


ig 


Immunoglobulin domain 


0.092 


10.7 


222 


WD40 


WD domain, G-beta repeat 


7.4e-23 


89.4 


224 


TPR 


TPR Domain 


1.2e-08 


42.1 


225 


DnaJ_CXXCXGX
G 


DnaJ central domain (4 repeats) 


1.5e-38 


141.5 


226 


DnaJ_CXXCXGX
G 


DnaJ central domain (4 repeats) 


1.5e-38 


141.5


229 


HSP70 


Hsp70 protein 


2.4e-54 


194.0 


230 


GSHPx 


Glutathione peroxidases 


3.4e-47 


170.2 


231 


tsp_1


Thrombospondin type 1 domain


0.0075


17.1 


233 


cyclin 


Cyclin 


4.6e-144


492.0 


234 


ras 


Ras family 


4.8e-50 


179.7 


235 


LRR 


Leucine Rich Repeat 


1.2e-30 


115.3


236 


LRR 


Leucine Rich Repeat 


6.7e-29 


109.4 


237 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


1.7e-09 


45.0


244 


dCMP_cyt_dea 
m


Cytidine and deoxycytidylate 
deaminase 


2.5e-0* 


31.1 


245 


ig 


Immunoglobulin domain 


6.7e-08 


30.5 


240 


wnt 


wnt family of developmental
signaling protei 


9.1e-270 


742.6 


250 


mito_carr


Mitochondrial carrier proteins 


1.3e-55 


193.6 


254 


adenylatekin 
ase 


Adenylate kinase 


1.8e-14 


55.7 


255 


Cation_efflu
x


Cation efflux family


2.8e-33


124.0


256 


SH3 


SH3 domain 


3.9e-14


60.4 


257 


Aa_trans


Transmembrane amino acid
transporter protein


2.6e-52


187.2 


258 


adenylatekin 
ase 


Adenylate kinase 


2.1e-110 


380.2 


259 


HIT 


HIT family 


8.2e-07 


25.3 


260 


Bacterial_PQ
Q


PQQ enzyme repeat


1.6e-15


65.0 


262 


proteasome 


Proteasome A-type and B-type


6.5e-64 


225.7 


267 


pkinase 


Eukaryotic protein kinase 
domain 


6.3e-27


101.0 


270 


filament 


intermediate filament proteins 


3.2e-150


512.5


272


Choline_kina
se


Choline/ethanolamine kinase


2e-67 


237.4 


277 


Ribosomal_S7


Ribosomal protein S7p/S5e 


3.3e-20 


80.6 


279 


pkinase 


Eukaryotic protein kinase 
domain 


3.3e-74


259.9 


280 


WD40 


WD domain, G-beta repeat 


7.8e-73 


255.4 


281


WD40


WD domain, G-beta repeat 


7.8e-73 


255.4 


284 


zf-DHHC 


DHHC zinc finger domain 


4.6e-24 


93.4 


287 


Exonuclease


Exonuclease 


1.4e-67 


238.0


291


SAM


SAM domain (Sterile alpha 
motif) 


0.034 


11.2 


292 


SAM 


SAM domain (Sterile alpha 
motif) 


0.034 


11.2 


294 


zf-C2H2


Zinc finger, C2H2 type


1.4e-29


111.7 


295 


zf-C2H2


Zinc finger, C2H2 type


2.2e-125


430.0 


296


mito_carr


Mitochondrial carrier proteins 


4.1e-59 


205.5 


297 


HMG_box


HMG (high mobility group) box 


6.7e-29 


109.4 


302 


Glycos_trans
f_4 


Glycosyl transferase 


5e-87 


302. £ 


304 


tRNA-synt_2 


tRNA synthetases class II (D, K 
and N) 


1.1e-84


294.8 


305 


KRAB 


KRAB box 


2e-44 


161.0 


306 


rrm 


RNA recognition motif. 


2.7e-44 


160.6 


308 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


£.2e-39 


126.1


309 


DNA_polymera
seX 


DNA polymerase X family 


2.4e-64 


227.2


311 


F-box 


F-box domain. 


9.5e-08 


39.2 


312 


ig 


Immunoglobulin domain


6.8e-19 


65.9 


313 


Ets


Ets-domain


8.1e-60 


192.3 


316


Kelch


Kelch motif 


1.3e-106 


367.6 


317 


arf 


ADP-ribosylation factor family 


3.2e-35 


130.4 


318 


sugar_tr 


Sugar (and other) transporter 


0.0003 


-73.1 


320 


pkinase 


Eukaryotic protein kinase 
domain 


8.1e-83 


288.6 


322 


pkinase 


Eukaryotic protein kinase 
domain 


4.9e-81 


282.6 


324 


Xlink 


Extracellular link domain 


4.5e-143


331.5 


326 


ARID 


ARID DNA binding domain 


5.1e-37 


136.4 


327 


HMG_box


HMG (high mobility group) box 


6.7e-29 


109.4 


328


cadherin 


Cadherin domain 


8.1e-81 


281.9 


331


chromo


'chromo' (CHRromatin
Organization Modifier)


4e-18 




333 


Peptidase_M2
2 


Glycoprotease family 


1.2e-136 


467.4


335


vwa


von Willebrand factor type A
domain 


2.3e-07 


37.9 


339 


ras 


Ras family 


7.8e-07


-59.1 


340 


zf-C2H2


Zinc finger, C2H2 type 


8.2e-64 


"22^.4 


342 


zf-C2H2


Zinc finger, C2H2 type 


2.4e-85 


297.0 


343 


ig


Immunoglobulin domain 


0.0005 


is.o 


346 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65 


229.1 


347 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65


229.1 


351 


EGF


EGF-like domain 


8.5e-20 


75.2 


352 


ank


Ank repeat


2.5e-101


350.0 


354 


TBC


TBC domain 


5.1e-15 


63.3 


355 


PHD 


PHD-finger


3.2e-07


37.4


358 


DUF6 


Integral membrane protein DUF6


0.033


15.8 


359 


zf-C2H2


Zinc finger, C2H2 type 


7.4e-20 


79.4 


361 


ank 


Ank repeat 


6.6e-34


126.1 


362 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


4.7e-53 


189.7 


363


efhand 


EF hand 


5.4e-10


46.6 


367 


LRR


Leucine Rich Repeat 


8.8e-44 


158.9 


368 


laminin_G


Laminin G domain


1.5e-33


121.7 


369 


PP2C


Protein phosphatase 2C


5.3e-20


73.9 


372


LIM


LIM domain containing proteins 


9.9e-15 


57.1 


373 


KRAB 


KRAB box 


4.8e-23 


90.0


376


ion_trans 


Ion transport protein 


2.9e-09


-4.2 


377 


Beach 


Beige/BEACH domain


4.9e-208


704.5


380 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-94 


327.5 


381 


AMP-binding


AMP-binding enzyme 


1.4e-07 


-140.3


382 


HECT 


HECT-domain (ubiquitin-
transferase).


1.3e-07 


-13.5 


384 


ank 


Ank repeat 


2.5e-101 


350.0 


386 


ig


Immunoglobulin domain 


9.5e-Q5 


23.6


388 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-42 


154.6 


389 


ig 


Immunoglobulin domain 


2.8e-15 


44.3 


390 


mito_carr


Mitochondrial carrier proteins 


3.5e-67 


233.2


392 


TPR 


TPR Domain 


6.1e-17 


49.7 


393 


SH3 


SH3 domain 


3.5e-09 


43.9 


394 


AAA 


ATPases associated with various 
cellular act 


4.1e-21


83.6 


396 


spectrin 


Spectrin repeat 


2.1e-67 


237.3


397 


zf-C2H2 


Zinc finger, C2H2 type 


0.0066 


23.1 


399 


fn3 


Fibronectin type III domain 


4.1e-102 


352.6 


400 


WD40 


WD domain, G-beta repeat 


0.00049 


26.8


401 


E1_dehydrog


Dehydrogenase E1 component


3e-119 


409.6 


402 


fn3


Fibronectin type III domain 


0 


1719.6 


404 


LRR 


Leucine Rich Repeat 


2.1e-10


48.0 


405 


cadherin 


Cadherin domain 


8.1e-81


281.9 


406 


zf-CXXC


CXXC zinc finger 


5e-15 


63.4 


410 


RhoGEF


RhoGEF domain 


1.1e-23


92.1 


411 


F-box 


F-box domain. 


4.2e-06 


33.7 


412 


SNF2_N


SNF2 and others N-terminal
domain 


5.8e-16 


61.6 


415 


CPSase_L_cha
in


Carbamoyl-phosphate synthase
(CPSase) 


1.5e-172 


586.6 


418 


LRR 


Leucine Rich Repeat 


3.8e-24


93.6 


419 


DENN 


DENN (AEX-3) domain


2e-5S 


207. S 


420 


RasGEF 


RasGEF domain 


8.1e-43 


155.7 


421 


ank 


Ank repeat 


1.4e-153 


523.7 


424 


G-patch 


G-patch domain


1e-19


78.9 


425 


pkinase 


Eukaryotic protein kinase 
domain 


2.2e-31 


117.1 


426 


plexin_repea
t


Plexin repeat


0.0023 


24.6 


427 


plexin_repea
t


Plexin repeat


0.0023


24.6


429 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


8.6e-11


39.2 




DEAD 


DEAD/DEAH box helicase


1e-66


214.0


432 


SH3 


SH3 domain 


3.4e-16 


67.2 


4 J J 


GTP_CDC


Cell division protein


2.1e-114


393.5 


436 


collagen 


Collagen triple helix repeat 
(20 copies) 


4.6e-194 


658.1


438 


Ricin_B_lect 
in 


Similarity to lectin domain of
ricin b 


O.OO^S 


10.5 


441 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai 


1.2e-256 


866.0 


442 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai


1.8e-235


795.7 


443 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


1.9e-65


230.9


445


LON


ATP-dependent protease La (LON)
domain 


0.00012 


-17.1 


446 


ig 


Immunoglobulin domain 


0.00011


20.1 


451


sushi 


Sushi domain (SCR repeat) 


1.4e-18 


75.2


452 


fn3


Fibronectin type III domain


1.5e-06


35.2 


454 


pyridoxal_de 
C 


Pyridoxal-dependent
decarboxylase conse


8.3e-14


50.3 


456 


kinesin 


Kinesin motor domain 


4.9e-217


734.4 


457 


neur_chan 


Neurotransmitter-gated ion- 
channel 


1e-175


597.1 


458 


Josephin 


Josephin 


0.0002 


18.7


468 


bZIP 


bZIP transcription factor 


1.7e-07 


31.8


470


NTP_transfer
ase 


Nucleotidyl transferase 


6.3e-06


-26.3 


471 


WD40 


WD domain, G-beta repeat 


2e-28 


107.9 


473 


LIM 


LIM domain containing proteins 


0.00021 


20.7 


477 


zf-RanBP 


Zn-finger in Ran binding
protein and others. 


0.023 


21.0 


479 


WD40 


WD domain, G-beta repeat 


6.5e-18


73.0 


480 


KRAB 


KRAB box 


1e-31


118.8 


481 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


8.4e-66 


232.0 


485


SH2 


Src homology domain 2 


0.011 


11.4 


486 


C1q


C1q domain


4.3e-74


259.6


487 


dsrm 


Double-stranded RNA binding
motif


1.1e-47


171.9 


489 


zf-C2H2 


Zinc finger, C2H2 type


4.8e-153 


521.9 


490 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai 


3.4e-222 


751. £ 


492 


SKI


Shikimate kinase


1.2e-10


48.8


497 


ENV_polyprot
ein


ENV polyprotein (coat
polyprotein)


2.6e-22 


77.6 


498


abhydrolase_ 
2 


Phospholipase/Carboxylesterase


0.041


-48.1


500 




RNA recognition motif.


5.4e-34 


126.4 


501


WW 


WW domain 


4.6e-18 


73.4 




ig 


Immunoglobulin domain


1.1e-10


39.5 


504


abhydrolase


alpha/beta hydrolase fold 


0.044 


-3.6 




vwa 


von Willebrand factor type A
domain


7.1e-62


219.0 




Na_K_ATPase_
C


Na+/K+ ATPase C-terminus


2.3e-145 


496.3 


509 


Exonuclease


Exonuclease


1.3e-56


201.5 


510 


Glycos_trans
f_1


Glycosyl transferases group 1


2.9e-06 


27.0 


511 


Glycos_trans
f_1


Glycosyl transferases group 1


2.9e-06


27.0 


512 


Glycos_trans
f_1


Glycosyl transferases group 1


1.5e-09


38.5


514 


pro_isomeras
e


Cyclophilin type peptidyl-
prolyl cis-tr 


1.8e-63 


221.4


515


EGF


EGF-like domain


1.9e-18


74.7


516 


Surp 


Surp module


4.3e-38 


140.0 


523 




Immunoglobulin domain 


3.3e-06 


25.0 


526 


UBX 


UBX domain 


1.1e-34


128.6 


528 


adh_zinc 


Zinc-binding dehydrogenases 


2.7e-34 


127.4 


CO ft 


SAM 


SAM domain (Sterile alpha 
motif) 


0.046 


10.0 


531 


adh_short 


short chain dehydrogenase 


0.0025 


-34.1 


532 


mito_carr


Mitochondrial carrier proteins


2.5e-81


281.7 


533 


mito_carr


Mitochondrial carrier proteins


2e-61


213.5 


534 


thiolase 


Thiolase 


3.5e-183 


622.0


535 


FMO-like 


Flavin-binding monooxygenase- 
like 


0 


1153.7 


536 


SCAN 


SCAN domain 


4e-55 


196.6


537 


tRNA-synt_l 


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0 


538 


tRNA-synt_l 


tRNA synthetases class I (I, L, 
M and V) 


3.1e-136 


466.0


539 


tRNA-synt_l 


tRNA synthetases class I (I, L,
M and V)


1.9e-117


403.6


540


tRNA-synt_l 


tRNA synthetases class I (I, L, 
M and V) 


3.1e-136


466.0 


541 


vATP-synt_E


ATP synthase (E/31 kDa) subunit 


5.9e-85 


295.7 


543 


zf-C2H2


Zinc finger, C2H2 type 


5.5e-69 


242. £ 


544 


DUF101 


Protein of unknown function 
DUF101 


8.5e-38 


139.0 


545 


TGFb_propept
ide


TGF-beta propeptide


1.1e-67


238.2 


547 


WD40 


WD domain, G-beta repeat 


2.6e-32 


126.8 


548 


RHD 


Rel homology domain (RHD).


1.6e-238


686.2 


549 


MMR_HSR1


GTPase of unknown function 


5.4e-67 


236.0 


551 


HECT 


HECT-domain (ubiquitin-
transferase).


4.3e-127 


435.6 


554 


MHC_II_alpha 


Class II histocompatibility 
antigen, alp 


3.5e-74 


259.8


555


zf-UBR1


Putative zinc finger in N-
recognin


3.3e-16


67.3 


556 


Kelch 


Kelch motif 


5.5e-29 


109.7 


561 


AMP-binding


AMP-binding enzyme


2.8e-06 


-163.7 


562 


PABP 


Poly-adenylate binding protein,
unique domai 


4.9e-38 


139.8 


564 


Gag_p30


Gag P30 core shell protein


1.2e-67 


238.2 


566 


PWWP 


PWWP domain 


8.1e-16


66.0 


567 


SCAN 


SCAN domain 


7.3e-68 


238.9 


569 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84 


294.3


570


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84 


294.3 


571 


CN_hydrolase


Carbon-nitrogen hydrolase


0.00081


-79.7 


572 


myosin_head 


Myosin head (motor domain) 


0 


1495.2 


dog- 


myosin_head 


Myosin head (motor domain) 


0 


1490.4 


b /b 


Surp 


Surp module 


1.7e-23 


91.5


TTt 


Surp 


Surp module 


1.7e-23 


91.5 


b / / 


DNA_pol_B 


DNA polymerase family B 


0 


1138.6


O Jo 


PDZ 


PDZ domain (Also known as DHR
or GLGF).


8.3e-09


42.7 




LRR


Leucine Rich Repeat 


4.9e-21 


83.3 


580 




Neurotransmitter-gated ion-
channel


5.9e-177


601.3


583 


sushi 


Sushi domain (SCR repeat) 


0 


1673.0 


584


DEAD 


DEAD/DEAH box helicase 


7.3e-36 


116.3 


586 


KH-domain


KH domain 


2.9e-13 


57.5


587 


G-patch 


G-patch domain 


2.3e-14 


61.2 


589 


LIM 


LIM domain containing proteins 


2.3e-36 


133.4 


590 


bromodomain 


Bromodomain 


6.6e-32 


114.7 


591 


bromodomain 


Bromodomain


6.6e-32 


114.7


592 


hormone_rec 


Ligand-binding domain of
nuclear hormone 


3.5e-22 


87.1 


593




PHD-finger


3.8e-12 


53.8 


594 


cadherin 


Cadherin domain 


4.2e-99 


342.7


596 


pkinase 


Eukaryotic protein kinase 
domain 


5e-92 


319.2


597 


WD40 


WD domain, G-beta repeat 


0.00054 


26.7 


600 


FG-GAP 


FG-GAP repeat 


4.3e-75 


262.9 


602 


G_Adapt_CT 


Gamma-adaptin, C-terminus


1.1e-53


191.8 


603 


pkinase 


Eukaryotic protein kinase 
domain 


2.3e-86


300.4


605 


Collagen 


Collagen triple helix repeat 
(20 copies) 


8e-42


152.4 


606 


mito_carr


Mitochondrial carrier proteins


6.3e-67


232.3


608 


PWWP 


PWWP domain 


2.6e-28


107.5 


609 


PWWP 


PWWP domain 


2.6e-28 


107.5 


613 


CAP_GLY


CAP-Gly domain


0.0046


20.1


615 


RFX_DNA_bind


RFX DNA-binding domain 


5.2e-54 


192.9 


616 




Kinesin motor domain


1.1e-81


284.8


617


kinesin


Kinesin motor domain 


8.4e-80 


278.5 


618 




Zinc finger, C3HC4 type (RING
finger)




13.1


620 


MATH 


MATH domain 


7.8e-05 


22.2 




Y_phosphatas
e


Protein-tyrosine phosphatase


1.4e-32


121.6


622 




Eukaryotic protein kinase


4.4e-40


146.6


623 


BNR 


BNR repeat 


2.1e-11


51.3 


624 


molybdopteri


Prokaryotic molybdopterin


1.4e-12


42.2


625 


TPR 


TPR Domain 


1.1e-17


72.2 


627 




rlrwisi "i n 

UUItlGIAtt 


3.7e-58




630 


adh_short


short chain dehydrogenase


5e-17


70.0 


631


zf-C2H2


Zinc finger, C2H2 type 


2.1e-88 


307.1 


632 








in e; 


635 


pkinase 


Eukaryotic protein kinase
domain


1.6e-104 


360.7 


636 


Fork_head


Fork head domain


5.9e-27


103.0 


637 


pkinase 


Eukaryotic protein kinase
domain


3.8e-70


C *i t> - -> 


642 


TPR 


TPR Domain 


4.8e-08 


40.1 


643 


efhand


EF hand


1.9e-27


104.6


647


SNF2_N


SNF2 and others N-terminal
domain


1.2e-101


351.1


648


PseudoU_synt
h_2


RNA pseudouridylate synthase 


1.9e-55 


197.6 


650 


zf-C2H2 


Zinc finger, C2H2 type 


0.0087 


22.7 


651 


ank 


Ank repeat 


1.3e-17 


71.9 


652 


I_LWEQ 


I/LWEQ domain 


9.5e-101 


341.0 


653 


neur_chan 


Neurotransmitter-gated ion- 
channel 


4.1e-171 


581.8 


654


tsp_1


Thrombospondin type 1 domain


4.1e-47 


169.9 


659 


FH2 


Formin Homology 2 Domain 


1e-107


371.2 


661


pou


Pou domain - N-terminal to
homeobox domain 


5.3e-45 


162.9 


662 


C2 


C2 domain 


6.7e-19 


76.2


663 


C2 


C2 domain 


6.7e-19 


76.2 


664 


C2 


C2 domain 


6.7e-19


76.2


667


GST


Glutathione S-transferases.


9.3e-34


114.4


668


LRR 


Leucine Rich Repeat 


9.3e-31 


115.6 


670 


spectrin 


Spectrin repeat 


4e-57 


203.2


671 


I_LWEQ 


I/LWEQ domain 


9.5e-101 


341.0 


672


ABC_tran 


ABC transporter 


5.3e-60 


212.8 


674 


WD40 


WD domain, G-beta repeat 


4.8e-24 


93.3


675


WD40


WD domain, G-beta repeat


4.8e-24


93.3


676 


LRR 


Leucine Rich Repeat 


0.0015 


25.2 


679 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.6e-29 


107.7 


680 


zf-C2H2


Zinc finger, C2H2 type 


5.2e-05 


30.1 


681


CH 


Calponin homology (CH) domain 


2.4e-17 


71.1 


682


DSPc


Dual specificity phosphatase, 
catalytic doma 


4.3e-43 


156.6 


683


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


0.051 


10.8 


687 


Synapsin 


Synapsin 


0 


1890.8 


689 


PR55 


Protein phosphatase 2A 
regulatory subunit PR 


0 


1038.8 




homeobox 


Homeobox domain 


8.5e-30 


112.4 


696 


Peptidase_M2 
4 


metallopeptidase family M24 


2.6e-59 


210.5


697


RhoGEF


RhoGEF domain 


9.5e-35 


128.9


698 


PHD 


PHD-finger


0.008


9.3 


701 


zf-C2H2


Zinc finger, C2H2 type 


5.5e-123 


422.0 


702 


Sulfatase


Sulfatase 


3e-231 


781.6 


703 


zf-C2H2


Zinc finger, C2H2 type 


5.7e-20 


79.8 


707 


Acyl_transf 


Acyl transferase domain 


1.1e-22


88.8


708 


WD40


WD domain, G-beta repeat 


4 .8e-is 


76.7 


710 


Ran_BPl 


RanBPl domain. 


8.4e-06


-7.3 


713 


DEAD 


DEAD/DEAH box helicase


9.9e-42 


134.9 


714 


PH 


PH domain 


1.6e-09 


39.0 


715 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


1.5e-37 


138.2 


717 


Sialyltransf


Sialyltransferase family


7.5e-31


115.9


718 




Immunoglobulin domain 


1e-29


100.8


719 


integrin_B 


Integrins, beta chain


0 


1125.4 


720 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1.1e-08


32.4 


722 


Peptidase_C2 


Calpain family cysteine 
protease 


3e-145


495.9 


723 


ig 


Immunoglobulin domain 


2.2e-05


22.4


724 


F-box 


F-box domain. 


0.007 


23.0 


725 


Nop 


Putative snoRNA binding domain 


8.1e-58


205.5 


726 


Nop 


Putative snoRNA binding domain 


8.1e-58 


205.5 


727 


WD40 


WD domain, G-beta repeat 


7.5e-26 


99.3 


730 


dsrm 


Double- stranded RNA binding 
motif 


0.027 


12.1 


731 


dynamin 


Dynamin family 


4.2e-16 


66.9 


733 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type


2.8e-10


41.7 


735 


CDP-OH_P_transf


CDP-alcohol
phosphatidyltransferase


4.2e-26 


100.1 


738 


DEAD 


DEAD/DEAH box helicase 


8.6e-57 


182.5 


* 




TSC-22/dip/bun family 


6.5e-32 


119.5 


742 


ras 


Ras family 


2.2e-100 


346.9 


743 


PMI_typeI 


Phosphomannose isomerase type I 


1.2e-243 


822.9 


747 


trypsin 


Trypsin 


6.4e-88 


279.4 


748 


kazal 


Kazal-type serine protease 
inhibitor domain 


2.2e-52 


187.4 


"7<;t 


efhand 


EF hand 


6.3e-06 


33.1 


/o JL 


PHD 


PHD-finger 


4.9e-16 


66.7 




zf-C2H2 


Zinc finger, C2H2 type 


3.2e-21 


83.9 


753 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


6.1e-11 


49.8 


754 


Ribosomal_L39 


Ribosomal L39 protein 


0.00018 


26.7 


755 


PH 


PH domain 


3.6e-14 


55.7 


758 


SCAN 


SCAN domain 


1.4e-53 


191.5 


759 


PA 


PA domain 


0.0065 


23.1 


760 


arf 


ADP-ribosylation factor family 


2.2e-19 


77.8 


761 


CIDE-N 


CIDE-N domain 


2.2e-40 


147.5 


SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


762 


histone 


Core histone H2A/H2B/H3/H4 


9.9e-53 


188.6 


763 


zf-MYND 


MYND finger 


4.1e-14 


60.3 


764 


pou 


Pou domain - N-terminal to 
homeobox domain 


1e-52 


188.6 


767 


vwc 


von Willebrand factor type C 
domain 


2.9e-34 


127.3 


769 


efhand 


EF hand 


4.8e-11 


50.1 


770 


zf-C4 


Zinc finger, C4 type (two 
domains) 


2.4e-53 


181.6 


772 


ras 


Ras family 


7e-90 


312.0 


773 


Sulfatase 


Sulfatase 


1e-142 


487.5 


775 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12 


55.5 


776 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12 


55.5 


777 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12 


55.5 


778 


rrm 


RNA recognition motif. 


2.1e-32 


121.1 


779 


G6PD 


Glucose-6-phosphate 
dehydrogenase 


1.5e-76 


236.6 


780 


spectrin 


Spectrin repeat 


3.7e-29 


110.3 


781 


mito_carr 


Mitochondrial carrier proteins 


4.6e-57 


198.5 


782 


SCAN 


SCAN domain 


1.3e-24 


95.2 


783 


PDZ 


PDZ domain (Also known as DHR 
or GLGF). 


4.1e-07 


37.1 


785 


DEAD 


DEAD/DEAH box helicase 


6e-06 


21.7 


786 


ras 


Ras family 


5.3e-39 


143.0 


787 


RNase_HII 


Ribonuclease HII 


2.5e-67 


237.1 


790 


PI3_PI4_kina 
se 


Phosphatidylinositol 3- and 4- 
kinases 


5.4e-108 


372.2 


795 


cadherin 


Cadherin domain 


2.5e-40 


147.4 


796 


ARID 


ARID DNA binding domain 


1.6e-20 


81.6 


797 


trypsin 


Trypsin 


9.9e-20 


64.8 


799 


CH 


Calponin homology (CH) domain 


3.7e-15 


63.8 


801 


Gal-bind_lectin 


Vertebrate galactoside-binding 
lectin 


4.1e-25 


88.7 


803 


WD40 


WD domain, G-beta repeat 


0.00082 


26.1 


806 


TBC 


TBC domain 


1.5e-26 


101.4 


807 


TBC 


TBC domain 


1.5e-26 


101.4 


808 


CN_hydrolase 


Carbon-nitrogen hydrolase 


8.8e-80 


278.5 


811 


CBFD_NFYB_HMF 


Histone-like transcription 
factor 


6e-14 


59.8 


812 


adh_short 


short chain dehydrogenase 


8.1e-20 


79.3 


814 


IMP4 


Domain of unknown function 


3.3e-71 


250.0 


815 


zf-C2H2 


Zinc finger, C2H2 type 


8.2e-66 


232.1 


816 


Pept_tRNA_hy 
dro 


Peptidyl-tRNA hydrolase 


1.6e-37 


138.0 


817 


ARID 


ARID DNA binding domain 


2.5e-18 


74.3 


826 


IF5_eIF4_eIF2 


eIF4-gamma/eIF5/eIF2-epsilon 


1.6e-32 


121.5 


830 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


1.5e-53 


191.3 


831 


LRR 


Leucine Rich Repeat 


2.1e-26 


101.1 


832 


laminin_EGF 


Laminin EGF-like (Domains III 
and V) 


2e-57 


204.2 


839 


rrm 


RNA recognition motif. 


1.3e-22 


88.5 


840 


Y_phosphatas 
e 


Protein-tyrosine phosphatase 


2.6e-119 


409.8 


841 


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-100 


346.3 


844 


Ribosomal_L22e 


Ribosomal L22e protein family 


1e-64 


228.4 


846 


IBR 


IBR domain 


9e-15 


62.5 


849 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


7.4e-07 


26.5 


850 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.00016 


18.9 


851 


SET 


SET domain 


5e-30 


113.2 


852 


SRCR 


Scavenger receptor cysteine- 
rich domain 


0 


1025.4 


853 


SRCR 


Scavenger receptor cysteine- 
rich domain 


0 


1025.4 


857 


lactamase_B 


Metallo-beta-lactamase 
superfamily 


0.012 


-6.0 




COX6A 


Cytochrome c oxidase subunit 
VIa 


3.4e-58 


206.7 


859 




RNA recognition motif. 


5.4e-45 


162.9 


861 


PRK 


Phosphoribulokinase 


5.1e-62 


219.4 


863 


mito_carr 


Mitochondrial carrier proteins 


2.9e-53 


185.5 


864 


HSP90 


Hsp90 protein 


4.7e-158 


538.5 


866 


ig 


Immunoglobulin domain 


4e-12 


44.1 


867 


zf-C2H2 


Zinc finger, C2H2 type 


7e-135 


461.5 


872 


histone 


Core histone H2A/H2B/H3/H4 


4.9e-41 


149.8 


874 


CPSase_L_chain 


Carbamoyl-phosphate synthase 
(CPSase) 


2.1e-218 


739.0 


879 


Ribosomal_S12e 


Ribosomal protein S12e 


2.1e-98 


340.3 


882 


serpin 


Serpins (serine protease 
inhibitors) 


2.5e-42 


145.7 


883 


Patatin 


Patatin 


1.2e-51 


182.0 


884 


RA 


Ras association (RalGDS/AF-6) 
domain 


0.044 


8.0 


887 


DUF92 


Integral membrane protein DUF92 


2.7e-12 


54.3 


889 


sugar_tr 


Sugar (and other) transporter 


8.2e-63 


222.1 


893 


DUF28 


Domain of unknown function 
DUF28 


1.3e-43 


158.3 


896 


IP_trans 


Phosphatidylinositol transfer 
protein 


6.5e-98 


338.7 


898 


DEAD 


DEAD/DEAH box helicase 


1.5e-48 


156.5 


899 


KE2 


KE2 family protein 


7e-61 


215.7 


900 


KE2 


KE2 family protein 


4.3e-51 


183.2 


901 


zf-C2H2 


Zinc finger, C2H2 type 


2.7e-57 


203.8 


902 


ras 


Ras family 


2.3e-75 


263.8 


904 


TPR 


TPR Domain 


3.2e-22 


87.2 


906 


GBP 


Guanylate-binding protein 


8.9e-253 


853.1 


907 


GBP 


Guanylate-binding protein 


1.1e-239 


809.6 


908 


WD40 


WD domain, G-beta repeat 


2.6e-26 


100.8 


909 


PH 


PH domain 


1.3e-09 


39.4 


910 


zf-C2H2 


Zinc finger, C2H2 type 


2.5e-39 


144.1 


913 


Epimerase 


NAD dependent 
epimerase/dehydratase family 


5e-07 


-88.5 


921 


TBC 


TBC domain 


1.5e-09 


30.7 


922 


WD40 


WD domain, G-beta repeat 


1.6e-25 


98.2 


923 


WD40 


WD domain, G-beta repeat 


8.2e-07 


36.1 


924 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


2.9e-05 


29.1 


925 


UQ_con 


Ubiquitin-conjugating enzyme 


0.00033 


-27.6 


926 


CH 


Calponin homology (CH) domain 


3.3e-53 


190.2 


928 


WD40 


WD domain, G-beta repeat 


5.9e-46 


172.7 


929 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


3.1e-10 


37.4 


930 


Ribul_P_3_epim 


Ribulose-phosphate 3 epimerase 
family 


7.2e-105 


361.8 


931 


Ribul_P_3_epim 


Ribulose-phosphate 3 epimerase 
family 


1.2e-96 


334.4 


936 


C2 


C2 domain 


2.2e-62 


220.7 


937 


NAP_family 


Nucleosome assembly protein 
(NAP) 


1.1e-22 


94. £ 


940 


abhydrolase 


alpha/beta hydrolase fold 


0.011 


3.1 


944 


Tropomyosin 


Tropomyosins 


3.2e-07 


25.1 


948 


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-75 


263.2 


949 


WD40 


WD domain, G-beta repeat 


1.8e-27 


104.7 


950 


Acyltransferase 


Acyltransferase 


1.6e-07 


38.4 


951 


SAM 


SAM domain (Sterile alpha 
motif) 


0.014 


14.5 


954 


GFO_IDH_MocA 


Oxidoreductase family 


1.3e-11 


52.0 




BTB 


BTB/POZ domain 


7e-22 


86.1 


956 


BTB 


BTB/POZ domain 


7e-22 


86.1 


957 


CDP- 


CDP-alcohol 
phosphatidyltransferase 


0.053 


-22.2 


959 


ras 


Ras family 


2.4e-97 


336.8 


960 


ras 


Ras family 


8.4e-43 


155.6 




Acetyltransf 


Acetyltransferase (GNAT) family 


1.2e-08 


42.2 


962 


adh_short 


short chain dehydrogenase 


2.4e-31 


117.6 


963 


mutT 


Bacterial mutT protein 


5.6e-06 


26.2 


969 


IF-2B 


Initiation factor 2 subunit 
family 


8.4e-193 


653.9 


970 


RNase_PH 


3' exoribonuclease family 


9e-24 


92.4 


975 


WW 


WW domain 


5.7e-25 


96.4 


977 


PDZ 


PDZ domain (Also known as DHR 
or GLGF). 


3.6e-21 


83.7 


978 


Ribosomal_L17 


Ribosomal protein L17 


2.4e-20 


81.0 


979 


LIM 


LIM domain containing proteins 


5.8e-42 


152.8 


980 


Calsequestri 
n 


Calsequestrin 


1.7e-297 


1001.7 


982 


HSP20 


Hsp20/alpha crystallin family 


1.2e-10 


43.2 


983 


oxidored_q6 


NADH ubiquinone oxidoreductase, 
20 Kd sub 


4.8e-63 


222.9 


988 


TBC 


TBC domain 


2.2e-50 


180.8 


989 


TBC 


TBC domain 


2.2e-50 


180.8 


993 


tRNA_int_end 
o 


tRNA intron endonuclease 


0.0017 


-34.2 


994 


homeobox 


Homeobox domain 


4e-18 


73.6 


997 


pyr_redox 


Pyridine nucleotide-disulphide 
oxidoreducta 


0.012 


11.6 


1000 


mito_carr 


Mitochondrial carrier proteins 


9.7e-123 


421.2 


1001 


RA 


Ras association (RalGDS/AF-6) 
domain 


1.2e-15 




1004 


DUF81 


Domain of unknown function 
DUF81 


0.099 


10.2 


1005 


actin 


Actin 


1.3e-174 


574.3 


1006 


actin 


Actin 


3.1e-130 


428.6 


1007 


cpn60_TCP1 


TCP-1/cpn60 chaperonin family 


3.7e-195 


661.8 


1008 


TPR 


TPR Domain 


8.1e-44 


159.0 


1009 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-61 


216.6 


1011 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-61 


216.6 


1012 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


4.7e-15 


53.1 


1016 


tRNA-synt_2c 


tRNA synthetases class II (A) 


2.3e-15 


55.2 


1018 


RhoGAP 


RhoGAP domain 


1.6e-78 


274.3 






Phosphoglycerate mutase family 


3.8e-18 


69.7 


1026 


HMG_box 


HMG (high mobility group) box 


8.4e-20 


79.2 


1027 


TBC 


TBC domain 


7.3e-45 


162.5 


1028 


UQ_con 


Ubiquitin-conjugating enzyme 


1.4e-49 


178.1 


1032 


PDZ 


PDZ domain (Also known as DHR 
or GLGF). 


0.028 


1*.3 


1034 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


2e-21 


84.6 


1037 


KRAB 


KRAB box 


4.8e-06 


32.4 


1038 


Cation_efflux 


Cation efflux family 


7.1e-42 


152.5 


1040 


ART 


NAD:arginine ADP- 
ribosyltransferase 


4.7e-47 


169.1 


1042 


WD40 


WD domain, G-beta repeat 


1.9e-18 


74.7 


1043 


zf-C2H2 


Zinc finger, C2H2 type 


3.7e-24 


93.7 


1045 


lectin_c 


Lectin C-type domain 


1.9e-28 


108.0 


1046 


Glucosamine_iso 


Glucosamine-6-phosphate 
isomerase 


0.00013 


-25.1 


1047 


ligase-CoA 


CoA-ligases 


4.5e-80 


279.4 




ig 


Immunoglobulin domain 


1.7e-09 


35.6 


1050 


Ribosomal_L24e 


Ribosomal protein L24e 


2e-33 


124.5 


1054 


Amidase 


Amidase 


4.3e-152 


518.7 




rrm 


RNA recognition motif. 


3.8e-26 


100.3 


1058 


annexin 


Annexin 


6.9e-44 


159.2 


1059 


PMP22_Claudin 


PMP-22/EMP/MP20/Claudin family 


0.023 


-23.6 


1060 


homeobox 


Homeobox domain 


3.2e-31 


117.2 


1062 


Acyltransferase 


Acyltransferase 


o.ooq£s" " 


10.5 


1064 


AMP-binding 


AMP-binding enzyme 


6.6e-100 


345.3 


1065 




Leucine Rich Repeat 


3.3e-14 


60.6 


1066 


GTP1_OBG 


GTP1/OBG family 


4.8e-41 


141.8 


1071 




Immunoglobulin domain 


8.4e-48 


159.1 


1072 


PHD 


PHD- finger 


6.8e-07 


36.3 


1074 


DENN 


DENN (AEX-3) domain 


8.3e-33 


121.5 


1075 


SCP 


SCP-like extracellular protein 


4.7e-41 


149.8 


1077 


OLF 


Olfactomedin-like domain 


2.2e-66 


234.0 


1078 


mito_carr 


Mitochondrial carrier proteins 


1e-42 


149.3 


1079 


WD40 


WD domain, G-beta repeat 


6.2e-45 


162.7 


jluu/ ; START 


START domain 


1.5e-48 


174.7 






Dual specificity phosphatase, 
catalytic doma 


3.3e-63 


223.4 


1094 


GSHPx 


Glutathione peroxidases 


9.6e-41 


148.8 


1095 


DUF25 


Domain of unknown function 
DUF25 


2e-75 


264.0 


1096 


DUF25 


Domain of unknown function 
DUF25 


6e-75 


262.4 


1105 


Nitroreductase 


Nitroreductase family 


1.3e-13 


58.6 


1106 


PT1S 


Phosphodiesterase family 


1.3e-179 


610.1 


1107 


DAGKc 


Diacylglycerol kinase catalytic 
domain 


0.00049 


19.6 


1109 


ras 


Ras family 


1.3e-15 


40.7 


1115 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


9.7e-47 


168.7 


1116 


HMG14_17 


HMG14 and HMG17 


4.4e-21 


83.5 


1117 


HMG14_17 


HMG14 and HMG17 


9.9e-12 


52.4 


1119 


FAA_hydrolase 


Fumarylacetoacetate (FAA) 
hydrolase fam 


2e-83 


290.6 


1120 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-94 


327.6 


1123 


abhydrolase 


alpha/beta hydrolase fold 


9.2e-23 


89.0 


1129 


pro_isomerase 


Cyclophilin type peptidyl- 
prolyl cis-tr 


2.2e-56 


197.1 


Til! 


DnaJ 


DnaJ domain 


1.6e-30 


114.9 


1132 


WD40 


WD domain, G-beta repeat 


1.3e-19 


78.6 


1133 


WD40 


WD domain, G-beta repeat 


1.8e-15 


64.9 


1134 


PH 


PH domain 


0.0015 


17.8 


1136 


Adap_comp_sub 


Adaptor complexes medium 
subunit family 


1.2e-256 


866.0 


1137 


Adap_comp_sub 


Adaptor complexes medium 
subunit family 


2.5e-209 


708.8 


1139 


ras 


Ras family 


1.5e-86 


301.0 


1141 


pkinase 


Eukaryotic protein kinase 
domain 


9.4e-74 


258.4 


1152 


Acyltransferase 


Acyltransferase 


1.2e-05 


29.9 


iiSa 


IRS 


PTB domain (IRS-1 type) 


5.4e-55 


196.1 


1155 


ig 


Immunoglobulin domain 


1.3e-31 


106.9 


1157 


Asparaginase 
_2 


Asparaginase 


6.4e-72 


252.3 


1159 


GMC_oxred 


GMC oxidoreductases 


4.7e-142 


485.3 


1160 


zf-AN1 


AN1-like Zinc finger 


0.00021 


27.9 


1163 


linker_histone 


linker histone H1 and H5 family 


3.8e-14 


60.4 


1164 


DED 


Death effector domain 


3.9e-05 


30.5 


1165 


IRS 


PTB domain (IRS-1 type) 


2.6e-43 


157.3 


1166 


IRS 


PTB domain (IRS-1 type) 


2.6e-43 


157.3 


HDD 


SAM 


SAM domain (Sterile alpha 
motif) 


0.04 


10.5 


•FTTH 

X X / u 


abhydrolase 


alpha/beta hydrolase fold 


0.098 


-7.5 


±-L 1 % 




SAP domain 


3.9e-10 


47.1 


XX / / 


PP2C 


Protein phosphatase 2C 


5.3e-31 


112.5 


1178 


WD40 


WD domain, G-beta repeat 


4.7e-35 


129.9 


1180 


Ets 


Ets-domain 


1.8e-09 


33.3 


1181 


Collagen 


Collagen triple helix repeat 
(20 copies) 


0.00016 


24.7 


1182 


TCL1_MTCP1 


TCL1/MTCP1 family 


9.5e-56 


198.6 


1184 


RasGEF 


RasGEF domain 


1.7e-88 


307.4 


1185 


mito_carr 


Mitochondrial carrier proteins 


1.5e-62 


217.3 


i±a1 


UPAR_LY6 


u-PAR/Ly-6 domain 


0.0042 


15.6 


1188 


Orn_DAP_Arg_deC 


Pyridoxal-dependent 
decarboxylase 


6.2e-128 


" 430.* 


1193 


Stathmin 


Stathmin family 


1.8e-90 


314.0 


1194 


Stathmin 


Stathmin family 


1.8e-90 


314.0 


1195 


Sec1 


Sec1 family 


3.2e-183 


622.1 


1196 


pyr_redox 


Pyridine nucleotide-disulphide 
oxidoreducta 


3.1e-32 


111.8 


1197 


Glyco_transf_8 


Glycosyl transferase family 8 


1.2e-09 


45.5 


1202 


K_tetra 


K+ channel tetramerisation 
domain 


0.022 


-16.8 


1203 


adh_short 


short chain dehydrogenase 


8.3e-45 


162.3 


1206 


Ubie_methyltran 


ubiE/COQ5 methyltransferase 
family 


1.3e-121 


417.4 


1208 


7tm_3 


7 transmembrane receptor 


7.2e-09 


29.0 


1209 


ank 


Ank repeat 


3.9e-15 


63.7 


1210 


vATP-synt_AC39 


ATP synthase (C/AC39) subunit 


2.5e-128 


439.7 


1212 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-17 


69.9 


1213 


efhand 


EF hand 


3.2e-07 


37.4 


1219 


rrm 


RNA recognition motif. 


2.1e-40 


147.7 


1220 


DUF6 


Integral membrane protein DUF6 


0.015 


21.5 


1222 


SCAN 


SCAN domain 


1.5e-71 


251.1 


1223 


G-gamma 


GGL domain 


3.6e-36 


129.5 


1227 


catalase 


Catalase 


0 


1158.9 


1232 


PX 


PX domain 


2.2e-15 


64.5 


1233 


PX 


PX domain 


2.2e-15 


64.5 


1236 


FCH 


Fes/CIP4 homology domain 


3.3e-09 


44.0 


1241 


Peptidase_M2 
0 


Peptidase family M20/M25/M40 


2e-63 


224.1 


1243 


WW 


WW domain 


0.044 


17.9 


1247 


UPF0006 


Metalloenzyme of unknown 
function UPF0006 


6.3e-61 


215.8 


1248 


Glycos_transf_2 


Glycosyl transferases 


4.5e-10 


45.9 


1249 


efhand 


EF hand 


4e-11 


50.4 


1254 


UQ_con 


Ubiquitin-conjugating enzyme 


2.1e-73 


257.3 


1255 


ras 


Ras family 


2.2e-62 


220.7 


1256 


formyl_transf 


Formyl transferase 


4.9e-30 


108.3 


1259 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-13 


46.4 


1261 


DiHfolate_red 


Dihydrofolate reductase 


2.1e-69 


241.7 


1262 


G_glu_transpept 


Gamma-glutamyltranspeptidase 


1.8e-110 


380.4 


1263 


PAS 


PAS domain 


1.3e-08 


36.9 


1265 


LRR 


Leucine Rich Repeat 


4.2e-22 


86.9 


1266 


SCP 


SCP-like extracellular protein 


6e-29 


108.0 


1267 


K_tetra 


K+ channel tetramerisation 
domain 


2.8e-27 


104.0 


1269 


ras 


Ras family 


1.3e-85 


297.9 


1275 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


4.2e-10 


37.0 


1276 


abhydrolase 


alpha/beta hydrolase fold 


5.4e-23 


89.8 


1277 


abhydrolase 


alpha/beta hydrolase fold 


5.6e-21 


83.1 


1279 


trypsin 


Trypsin 


4.4e-41 


132.0 


1280 


PBP 


Phosphatidylethanolamine- 
binding protein 


1.3e-13 


58.7 


1285 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.6e-14 


49.6 


1287 


ank 


Ank repeat 


1.7e-52 


187.8 


1294 


fn3 


Fibronectin type III domain 


0.026 


20.9 


1295 


GBP 


Guanylate-binding protein 


0.00026 


-70.0 


1296 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family 


6.9e-41 


149.3 


1297 


Rhodanese 


Rhodanese-like domain 


3.2e-14 


60.7 


1298 


LIM 


LIM domain containing proteins 


5.8e-21 


79.1 


1301 


rnaseA 


Pancreatic ribonucleases 


4.9e-43 


145.2 


1307 


mito_carr 


Mitochondrial carrier proteins 


2.1e-53 


186.0 


1308 


WD40 


WD domain, G-beta repeat 


1.6e-17 


71.6 


1310 


UPAR_LY6 


u-PAR/Ly-6 domain 


7.1e-20 


75.5 


1313 


thiored 


Thioredoxin 


3.6e-05 


21.6 


1314 


Aa_trans 


Transmembrane amino acid 
transporter protein 


1.5e-67 


237.9 


1316 


trypsin 


Trypsin 


4.4e-41 


132.0 


1320 


Ribosomal_L13 


Ribosomal protein L13 


3.9e-62 


219.8 


1327 


Armadillo_seg 


Armadillo/beta-catenin-like 
repeats 


0.0054 


23.4 


1328 


KRAB 


KRAB box 


0.052 


-5.6 


1329 


rrm 


RNA recognition motif. 


2.1e-40 


147.7 


1330 


Bcl-2 


Apoptosis regulator proteins, 
Bcl-2 family 


0.014 


-1.6 


1331 


PX 


PX domain 


2.1e-10 


48.0 


1333 


KRAB 


KRAB box 


1.8e-36 


134.6 


1334 


UPP_syntheta 
se 


Putative undecaprenyl 
diphosphate synt 


2.3e-89 


310.3 


1335 


UPP_syntheta 
se 


Putative undecaprenyl 
diphosphate synt 


1.5e-59 


211.0 


1336 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


1.2e-31 


118.6 


1337 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


2.3e-12 


54.5 


1338 


TPR 


TPR Domain 


0.00021 


28.1 


1340 


metalthio 


Metallothionein 


0.013 


20.3 


1341 


mutT 


Bacterial mutT protein 


5.8e-09 


36.5 


1343 


Band_41 


FERM domain (Band 4.1 family) 


1.3e-38 


122.5 


1344 


Kelch 


Kelch motif 


1.4e-44 


161.5 


134* 


Antifreeze 


Antifreeze protein 


1.2e-10 


48.8 


1347 


3Beta_HSD 


3-beta hydroxysteroid 
dehydrogenase/isomera 


0.086 


-177.2 


1348 


BTB 


BTB/POZ domain 


5.3e-28 


106.5 


1349 


DUF6 


Integral membrane protein DUF6 


0.033 


15.8 


1350 


myosin_head 


Myosin head (motor domain) 


0 


1088.7 


1352 


Nramp 


Natural resistance-associated 
macrophage pro 


1.2e-202 


686.6 


1353 


S_100 


S-100/ICaBP type calcium 
binding domain 


5.3e-23 


89.9 


1355 


DEAD 


DEAD/DEAH box helicase 


3.6e-65 


209.0 


1356 


C2 


C2 domain 


2.4e-15 


64.4 


1357 


RBD 


Raf-like Ras-binding domain 


4.2e-57 


203.1 


1360 


zf-C2H2 


Zinc finger, C2H2 type 


7.4e-141 


481.4 


1361 


HMG14_17 


HMG14 and HMG17 


7.9e-40 


145.7 


1362 


SIS 


SIS domain 


3.8e-30 


113.6 


1363 


SIS 


SIS domain 


1.3e-28 


108.5 


1364 


ig 


Immunoglobulin domain 


0.00026 


19.0 


1368 


K_tetra 


K+ channel tetramerisation 
domain 


1.1e-16 


68.9 


1371 


Collagen 


Collagen triple helix repeat 
(20 copies) 


2.2e-113 


390.1 


1372 


DnaJ 


DnaJ domain 


6.6e-36 


132.7 


1376 


KRAB 


KRAB box 


2.1e-38 


141.0 


1378 


ELM2 


ELM2 domain 


2e-23 


91.3 


1380 


thiored 


Thioredoxin 


1.2e-23 


82.8 


1381 


ank 


Ank repeat 


2.3e-83 


290.4 


1382 


BTB 


BTB/POZ domain 


3e-11 


50.8 


1383 


WD40 


WD domain, G-beta repeat 


1.6e-19 


78.3 


1384 


WD40 


WD domain, G-beta repeat 


6.3e-24 


92.9 


1387 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1.1e-09 


35.4 


1389 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-50 


179.5 


1390 


zf-C2H2 


Zinc finger, C2H2 type 


2.5e-85 


296.9 


1393 


kinesin 


Kinesin motor domain 


7.8e-188 


637.4 


1394 


zf-C2H2 


Zinc finger, C2H2 type 


1.2e-49 


178.4 


1398 


KRAB 


KRAB box 


5.1e-22 


86.6 


1402 


bZIP 


bZIP transcription factor 


0.035 


13.1 


1405 


sugar_tr 


Sugar (and other) transporter 


0.003 


-101.5 


1406 


RhoGAP 


RhoGAP domain 


8.9e-47 


168.6 


1407 


rrm 


RNA recognition motif. 


1e-35 


132.1 


1408 


LRR 


Leucine Rich Repeat 


2.1e-13 


58.0 


1409 


Nebulin_repeat 


Nebulin repeat 


6e-54 


192.6 


1410 


ank 


Ank repeat 


1.6e-17 


71.6 


1412 


Ribosomal_L5_C 


ribosomal L5P family C-terminus 


8.2e-58 


205.5 


1415 


trypsin 


Trypsin 


4.7e-85 


270.4 


1416 


aminotran_1 


Aminotransferases class-I 


4.4e-05 


-91.2 


1417 


S1 


S1 RNA binding domain 


1.5e-07 


33.1 


1419 


WD40 


WD domain, G-beta repeat 


2.2e-09 


44.6 


1422 


cadherin 


Cadherin domain 


8.3e-42 


152.3 


1424 


SH3 


SH3 domain 


2.5e-80 


280.3 


1425 


PHD 


PHD-finger 


3.2e-17 


70.6 


1426 


PHD 


PHD-finger 


3.2e-17 


70.6 


1427 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


1e-37 


138.8 


1428 




helicase_C 


Helicases conserved C-terminal 
domain 


1e-26 


102.2 


1429 


WD40 


WD domain, G-beta repeat 


3.9e-07 


37.2 


1430 


Inositol_P 


Inositol monophosphatase family 


2.5e-10 


40.2 


1431 


mito_carr 


Mitochondrial carrier proteins 


4.3e-83 


287.7 


1433 


C1q 


C1q domain 


2.9e-16 




1434 


WD40 


WD domain, G-beta repeat 


1.6e-13 


58.3 


1435 


Inos-1-P_synth 


Myo-inositol-1-phosphate 
synthase 


7e-228 


716.4 


1 A "3 C 


rrm 


RNA recognition motif. 


1.4e-34 


128.3 


1438 


ig 


Immunoglobulin domain 


1.3e-12 


45.6 


1 AAC\ 


G_Adapt_CT 


Gamma-adaptin, C-terminus 


3.4e-67 


236.7 


1 A A 1 


G_Adapt_CT 


Gamma-adaptin, C-terminus 


3.4e-67 


236.7 




Kelch 


Kelch motif 


0.00013 


28.7 


T A A C 


ARID 


ARID DNA binding domain 


1.8e-21 


84.7 


1447 


zf-C2H2 




9.4e-28 


105.6 


1448 


AMP-binding 


AMP-binding enzyme 


2.6e-07 


-145.1 


1451 


rrm 


RNA recognition motif. 


6.5e-21 


82.9 


1454 


ig 


Immunoglobulin domain 


5.6e-44 


146.7 


1455 


Sialyltransf 


Sialyltransferase family 


5.4e-21 


83.2 


1460 


Aldose_epim 


Aldose 1-epimerase 


1.9e-35 


131.2 


1461 


C2 


C2 domain 


4e-18 


73.5 


1470 




IPT/TIG domain 


3.1e-19 


77.3 


1472 




RNA pseudouridylate synthase 


4.3e-16 


66.9 


1474 




DENN (AEX-3) domain 


1.3e-44 


161.6 


l47<* 


Cation_efflux 


Cation efflux family 


4.6e-49 


176.4 


1477 


TBC 


TBC domain 


" "Se-47 


169.0 


1478 


rrm 


RNA recognition motif. 


2e-21 


84.6 


1480 


ig 


Immunoglobulin domain 


5.5e-06 


24.3 


1484 


Telo_bind_al 
pha 


Telomere-binding protein alpha 
subuni 


0.028 


-225.9 


1485 


zf-C2H2 


Zinc finger, C2H2 type 


1.8e-68 


240.4 


1486 


pkinase 


Eukaryotic protein kinase 
domain 


9.5e-13 


49.9 


1488 


helicase_C 


Helicases conserved C-terminal 
domain 


1.4e-15 


65.2 


1489 


DUF89 


Protein of unknown function 
DUF89 


0.079 


-132.4 


1490 


ECH 


Enoyl-CoA hydratase/isomerase 
family 


5.2e-41 


149.7 


1491 


guanylate_cy 
c 


Adenylate and Guanylate cyclase 
catalyt 


5.9e-46 


166.1 


1492 


LRR 


Leucine Rich Repeat 


3.4e-19 


77.2 


1495 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


7.1e-10 


36.3 


1497 


pkinase 


Eukaryotic protein kinase 
domain 


1e-22 


85.8 


1500 


SH3 


SH3 domain 


9.3e-05 


27.2 


1502 


homeobox 


Homeobox domain 


0.084 


13.8 


1503 


homeobox 


Homeobox domain 


0.084 


13.8 


1505 


EGF 


EGF-like domain 


2.7e-23 


90.8 


1506 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


2.7e-21 


84.2 


1508 


Peptidase_M2 
0 


Peptidase family M20/M25/M40 


2.8e-28 


101.8 


1511 


PX 


PX domain 


1.9e-11 


51.5 


1512 


Sulfatase 


Sulfatase 


2.8e-35 


130.7 


1516 


Syntaxin 


Syntaxin 


0.011 


-62.3 


1518 


aminotran_3 


Aminotransferases class-III 
pyridoxal-pho 


9.7e-106 


305.6 


1520 


ig 


Immunoglobulin domain 


0.075 


11.0 


1521 


RA 


Ras association (RalGDS/AF-6) 
domain 


0.013 


13.3 


1523 


RhoGAP 


RhoGAP domain 


2.5e-05 


10.7 


1528 


WD40 


WD domain, G-beta repeat 


5.4e-24 


93.1 


1535 


IMS 


impB/mucB/samB family 


7.8e-95 


328.5 


1538 


FYVE 


FYVE zinc finger 


3.2e-27 


101.5 


1539 


DAGKc 


Diacylglycerol kinase catalytic 
domain 


6e-07 


3*. 5 


1540 


Ocular_alb 


Ocular albinism type 1 protein 


0 


1184.7 


1653 


SAP 


SAP domain 


6e-06 


33.2 


1654 


Amino_oxidas 
e 


Flavin containing amine oxidase 


3.2e-43 


157.0 




Amino_oxidas 
e 


Flavin containing amine oxidase 


3.2e-43 


157.0 


XD3Q 


RhoGEF 


RhoGEF domain 


1.4e-24 


95.1 


1657 


MMR_HSR1 


GTPase of unknown function 


0.0011 


-45.5 


1659 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


2.5e-11 


51.1 


1660 


actin 




6.6e-21 


69.9 


1661 


BAH 


BAH domain 


1.7e-82 


287.5 


1662 


vwa 


von Willebrand factor type A 
domain 


0 


1909.4 


1663 


WD40 


WD domain, G-beta repeat 


1.4e-67 


237.9 


1667 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-93 


324.4 


1669 


Nol1_Nop2_Sun 


NOL1/NOP2/sun family 


1.3e-23 


84.3 


1671 


SH2 


Src homology domain 2 


5.4e-15 


46.9 


1672 


chromo 


'chromo' (CHRromatin 
Organization Modifier) 


2.1e-18 


67.7 


"i g 7 A 

AO i'k 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


0.0025 


17.6 




Glyco_hydro_47 


Glycosyl hydrolase family 47 


1.8e-187 


636.2 


1677 


Glyco_hydro_47 


Glycosyl hydrolase family 47 


4.5e-74 


259.5 


1680 


WD40 


WD domain, G-beta repeat 


1.1e-27 


105.5 


1681 


WD40 


WD domain, G-beta repeat 


1.1e-27 


105.5 


1 Col 




GTPase of unknown function 


1.8e-78 


274.1 


ID"! 


rrm 


RNA recognition motif. 


1.8e-37 


137.9 


1692 


rrm 


RNA recognition motif. 


1.8e-37 


137.9 




AAA 


ATPases associated with various 
cellular act 


1.3e-81 


284.5 


1697 


Ferric_reduc 
t 


Ferric reductase like 
transmembrane com 


8.4e-83 


285.2 


1698 


Ferric_reduc 
t 


Ferric reductase like 
transmembrane com 


3.5e-53 


190.1 


1699 


zf-C2H2 


Zinc finger, C2H2 type 


4.4e-34 


126.6 


1700 


arf 


ADP-ribosylation factor family 


9e-19 


75.8 


1702 


GTP_EFTU 


Elongation factor Tu family 


0.014 


11.4 


1703 


SCAN 


SCAN domain 


1.8e-54 


194.4 


1707 


pkinase 


Eukaryotic protein kinase 
domain 


1.2e-88 


307.9 


1709 


WD40 


WD domain, G-beta repeat 


0.0035 


24.0 


1710 


LRR 


Leucine Rich Repeat 


1.2e-30 


115.3 


1711 


WW 


WW domain 


7.6e-12 


52.3 


1712 


ank 


Ank repeat 


4.2e-34 


126.7 


1713 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.6e-09 


38.3 


1714 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.6e-09 


38.3 


1715 


ras 


Ras family 


4.4e-41 


149.9 


1718 


HMG_box 


HMG (high mobility group) box 


8.3e-21 


82.6 


1719 


TBC 


TBC domain 


1.1e-45 


165.2 


1721 


HLH 


Helix-loop-helix DNA-binding 
domain 


9.2e-10 


45.9 


1723 


dsrm 


Double-stranded RNA binding 
motif 


2.9e-05 


30.9 


1724 


RrnaAD 


Ribosomal RNA adenine 
dimethylases 


0.045 


9.2 


1725 


CIDE-N 


CIDE-N domain 


5.9e-40 


146.2 


1725 


HAT 


HAT (Half -A-TPR) repeats 


2.9e-44 


160.5 


172 3 


efhand 


EF hand 


5.1e-20 


79.9 


1733 


Hist_deacety 


Histone deacetylase family 


1.7e-104 


3*0. * 


1735 


LRR 


Leucine Rich Repeat 


4.6e-34 


126.6 


1739 


PI-PLC-X 


Phosphatidylinositol-specific 
phospholipase 


0.0023 


16.1 


1743 


ras 


Ras family 


3.7e-10 


-21.3 


1744 




Ras family 


3.7e-10 


-21.3 


1745 


RasGEF 


RasGEF domain 


3.2e-49 


176.9 


1746 


adh_short 


short chain dehydrogenase 


7.1e-08 


34.6 


1751 


zf-C2H2 


Zinc finger, C2H2 type 


9e-39 


142.2 


1754 


fn3 


Fibronectin type III domain 


5.5e-101 


348.9 


1756 


zf-C2H2 


Zinc finger, C2H2 type 


6.3e-93 


322.1 


1758 


rrm 


RNA recognition motif. 


0.017 


21.2 


1760 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8 


1761 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8 


1765 


MMR_HSR1 


GTPase of unknown function 


6.4e-41 


149.4 


1769 


CN_hydrolase 


Carbon- nitrogen hydrolase 


3e-06 


-43.9 


1775 


ank 


Ank repeat 


4 .le-07 


37.1 


1779 


Oxysterol_BP 


Oxysterol-binding protein 


4.7e-56 


199.6 


1783 


RhoGEF 


RhoGEF domain 


1.6e-23 


91.6 


1784 


RhoGEF 


RhoGEF domain 


1.6e-23 


91.6 


1785 


rrm 


RNA recognition motif.


6.4e-14 


59.7 



TABLE 5



SEQ ID NO:


POSITION OF
SIGNAL IN AMINO
ACID SEQUENCE


MaxS (MAXIMUM
SCORE)


MeanS (MEAN
SCORE)






1


1-21


0.991


0.955


2


1-31


0.995


0.944


3


1-33


0.949


0.736 


4 


1-19 


0.970


0.951


5 




0.971


0.863 


6


1-26


0.971


0.863 




1-26 


0.971 


0.863 


8 




1-26 


0.971


0.863 


9 


1-46 


0.982 


0.901 


10 


1-21 


0.991


0.955 


11 


1-23 


0.989


0.899 


12 


1-25 


0.955 


0.803 


13 


1-18 


0.932 


0.625 


14 


1-18 


0.938 


0.876 


15 


1-25 


0.941 


0.811 


16 


1-17 


0.972 


0.S39 


17 


1-27 


0.964 


0.777 


18 


1-16 


0.914 


0.657 


19 


1-19 


0.953 


0.840 


20 


1-20 


0.935 


0.701


21 


1-22 


0.974 


0.850 


22 


1-33 


0.961 


0.895 


23 


1-19 


0.991 


0.959 


24


1-31 


0.995 


0.944 


25 


1-22 


0.976 


0.935 


26


1-27 


0.996 


0.928 


27 


1-24 


0.953 


0.739 


28 


1-21 


0.906 


0.688 


29 


1-31 


0.986 


0.841 


30 


1-28 


0.980 


0.893 


31 


1-19 


0.993 


0.976 


32 


1-22 


0.998 


0.909 


35 


1-33 


0.949 


0.736 


36 


1-33 


0.949 


0.736


46 


1-19 


0.970


0.951 


67 


1-25 


0.968 


0.848 


71 


1-18 


0.949 


0.845 


72 


1-30


0.991 


0.919 


75 


1-29 


0.958 


0.854 


88 


1-20 


0.986


0.945 


94 


1-33 


0.994 


0.943


97 


1-06 


0.964 


0.595 




1-49


0.983


0.570 


108 


1-26 


0.978


0.885 


111


1-23


0.989


0.899 


IOC -4 


1-25 


0.955


0.803 


129 


1-19 


0.963


0.918


138 


1-29 


0.971


0.844


143 


l lo 


0.914


0.628 


148 


1-20 


0.969


0.904 


156 


1-25 


0.941


0.811 


158 


1-22 


0.979


0.927 


160 


1-17 


0.972


ft Q 


161 


1-48 


0.903 


0.571 


162 


1-25 


0.937


0.729


168 


1-16 


0.939 


0.826 


171 


1-27 


0.964 


0.777 


178 


1-21


0.945 


0.825 


180 


1-27 


0.981


0.941 


187 


1-28 


0.962 


0.936 


190 


1-19


0.953


0.840 


196 


1-22 


0.975 


0.916


197 


1-22 


0.<^3 


0.936




1-20 


0.935


0.701


200 


1-23 


0.977


0.773


206 


1-30 


0.984


0.890


207 


1-19 


0.990


0.924 


~2~o3 


1-22 


0.974


0.350




1-40 


0.940


0.670


211 


1-28 


0.971


0.849 




1-24 


0.986 


0.956 


218


1-33


0.961


0.895 


-ixy 


1-19 


0.970


0.871 




1-19 


0.904


0.553 


222 


1-21 


0.917 


0.555 


230


1-19


0.991


0.959


231


1-26


0.953


0.800 


232 


1-25 


0.988 


0.826 


239 


1-23 


0.969 


0.828 


240 


1-17 


0.982 


0.955 


241 


1-17 


0.982


0.95* 


245 


1-30 


0.970 


0.722 


248 


1-22 


0.976


0.935 


249 


1-23 


0.968 


0.940


252 


1-18 


0.971 


0.923 


261 


1-24 


0.883 


0.587 


265 


1-18 


0.939 


0.868 


272 


1-24 


0.953 


0.739 


283 


1-21 


0.906 


0.688 


284 


1-29 


0.997 


0.854


290 


1-31 


0.986 


0.841 


302 


1-28 


0.980 


0.893 


304 


1-16 


0.907 


0.635 


312 


1-19 


0.993 


0.976 


313 


1-17 


0.930 


0.753 


323 


1-22 


0.998 


0.909 


324 


1-17 


0.982 


0.954 


328 


1-19 


0.971 


0.865 


329 


1-22 


0.963 


0.924 


330 


1-33 


0.978


0.841


331


1-24 


0.920 


0.712 


332 


1-24 


0.975 


0.881 


333 


1-19 


0.984 


0.941 


334 


1-20 


0.899 


0.567 


335 


1-27 


0.942 


0.813 


336


1-20 


0.952 


0.850 


337 


1-38 


0.942 


0.653 


338 


1-27 


0.973 


0.772 




1-36 


0.979


0.804 




1-27 


0.888


0.597 


343 


1-19 


0.971


0.865


344


1-22


0.994


0.928


345


1-17


0.966


0.687


346


1-19


0.936


0.822


347


1-22


0.963


0.924


349 


1-24 


0.982


0.966


351


1-21


0.918


0.815


352


1-31


0.988


0.912


354


1-31


0.974


0.839 


355 


1-29 


0.932


n cii 

u • OJZ 


356


1-15


0.994


0.969 


357


1-33 


0.935 


0.726 


360


1-27 


0.938 


0.827 


361 


1-25 


0.954


0.674 


362 


1-22 


0.929 


0.788


363


1-21


0.881


0.715 


364 


1-33 


0.978 


0.841 


365 


1-33


0.978 


0.841


366 


1-21 


0.916 


0.820 


367 


1-19 


0.936


0.822 


368 


1-29 


0.972 


0.874 


370 


1-24 


0.920


0.712 


371 


1-24 


0.961


0.172 


372 


1-27 


0.919 


0.768 


373 


1-19 


0.986 


0.945 


375 


1-32 


0.994 


0.932 


376 


1-34 


0.987 


0.810 


377 


1-17 


0.995 


0.950 


378 


1-49 


0.971 


0.749 


380 


1-20 


0.968 


0.874 


381 


1-20 


0.928


0.782


382


1-19 


0.986 


0.934 


383


1-28 


0.965 


0.829 


384 


1-39 


0.970 


0.551 


386 


1-24 


0.975 


0.881 


388 


1-30 


0.989 


0.868 


389 


1-19 


0.984 


0.941 


390 


1-26 


0.971 


0.782 


392 


1-20 


0.981 


0.900 


393 


1-16 


0.968 


0.890 


394 


1-23 


0.937 


0.701 


397 


1-22 


0.985 


0.854 


399 


1-46 


0.977 


0.698 


401 


1-20 


0.899 


0.567 


402 


1-22 


0.967 


0.931 


403 


1-27 


0.992 


0.934 


404 


1-19 


0.991 


0.973 


405 


1-23 


0.994 


0.921 


407 


1-35 


0.987 


0.658


408


1-39


0.976 


0.551 


409 


1-33 


0.897 


0.570 


410 


1-25 


0.990 


0.962 


411 


1-38 


0.97^ 


0.827 


412 


1-20 


0.944 


0.768 


413 


1-20 


0.988


0.965


414 


1-46 


0.993 


0.638 


415 


1-23 


0.981 


0.940 


417 


1-29 


0.941 


0.672 


418 


1-20 


0.952 


0.850


419 


1-19 


0.986 


0.967 


420 


1-29 


0.965 


0.861 


421 


1-22 


0.889 


0.785


422 


1-48 


0.962 


0.862 


424 


1-19 


0.979 


0.933 


428 


1-38 


0.942


0.653 


430 


1-18 


0.947 


0.595 


432 


1-33 


0.957 


0.789 


433 


1-26 


0.979 


0.904 


434 


1-27 


0.962 


0.777 


it-aC ■ 


1-24 


0.998 


0.977 




1-27 


0.973 


0.772 


443 


1-15


0.966 


0.940 


448 


1-36 


0.979 


0.804 


453 


1-41 


0.958 


0.609 


455 


1-33 


0.943


0.606


457 


1-27 


0.888 


0.597 


462 


1-16 


0.925 


0.681 


486 


1-27 


0.972 


0.845 


495 


1-24 


0.917 


0.636 


498 


1-26 


0.993 


0.890 


505 


1-20 


0.976 


0.926 


507 


1-17 


0.966 


0.687


510 


1-23 


0.930 


0.593


511


1-23


0.930


0.593


512 


1-23 


" ft q-j/> 


0.593


515


1-18 


ft 070 


0.956


523 


1-19 


1/ • 7 J w 


0.822


529 


1-22


u - 70 J 


0.924


54S 


1-24 


n qa 9 


0.966


550


1-30




0.713


552 


i-2i 


U • 27 # J 


0.912




1-23 


ft QC Q 


0.784


571 


1-21 


U . 31 □ 


0.815


574 




0.988


0.912


580


A" J y 


0.525


0.556


594 


J,— J J. 


0.974


0.839 


608 


1 "n d 

JL" Z 3 


0.932


0.632 


609 


JL-4J 17 


0.932


0.632 


610 




0.990


0.940 


621 


1-15


0.994


0.969


623


1-33


0.935


0.726


£53 


1-27


0.938


0.827 




1-22 


0.929


0.788 


"Sr5 — 

O / / 


1-16 


0.948


0.807




1-21 


0.881


0.715 


"699 




0.975


0.816 


rUi 


1-31 


0.968 


0.898 


/ u / 

,_ 


1-16 


0.860 


0.562 


/ 13 


1-25 


0.966 


0.743 


/ID 


1-19 


0.936


0.822




1-20


0.961


0.824 


/ at? 


1-29 


0.972 


0.874 


73S 


1-46 


0.903


0.598 


/ * t> 


1-14 


0.916 


0.730 


747 


1-22 


0.965


0.876 


7 a "a 


1-29 


0.968 


0.785 




1-24 


0.961


0.773


767


1-27


0.919


0.768


768


1-33


0.900


0.585


773


1-42


0.959


0.702


779


1-19


0.986


0.945 


797 


~i — To " — 

jl— j. a 


0.944


0.759 


798 


A— JL 3 


0.900


0.568


820 


1-17 


0.995 


0.950


827 


1-49 


0.971


0.749


848


1-20


0.968


0.874


864




0.926


0.782


866


1-19


0.986


0.934


873


1-23


0.948


0.886


881


1-28


0.965


0.829


887


1-39


0.970


0.551


927


1-30


0.989


0.868


934


1-49


0.988


0.777


939 


1-39 




0.889


944 


1-26 


n Q*7i 


0.782


950 


1-29 


u . j*o / 


0.845


963 


1-20 


U . yoi. 


0.900


964 


1-20 


0.886


0.556


973


1-16


0.968


0.890


980 


1-34 


0.961 


0.749


981 


1-20 


0.953 


0.822 


984 


1-12 


0.938 


0.780 


1015 


1-22 


0.985 


0.854 


1040 


1-46 


0.977


0.698 


1052 


1-18 


0.969 


0.842 


1059


1-20


0.927


0.867 


1065 


1-33 


0.983 


0.918 


1069 


1-22 


0.993


0.935


1075 


1-27 




0.934


1080 


1-19 




0.829


1092 


1-19 




0.973


1094 


1-46 


0.992


0.653


1095


1-30


0.974


0.929


1105 


1-23 




0.921 


1123


1-35


0.987


0.658


1138 


1-32 


0.954 


0.613


1140


1-39


0.989


0.789


1142 


1-33 


0.897 


0.570


1152




0.990


0.962


■ixlo 


1-38 


n 0*7*7 


0.827


1176


1-20


0.944


0.768


1187 


1-20 


n ODD 


0.965


1189 


1-35 


0.967


0.839


1192


1-46


0.993


0.638


1193


1-16


0.925


0.710


1197


1-29


0.985


0.853


1208


1-23


0.981


0.940 


1225 


1-29


0.941


0.672 


1245 




0.986


0.967 


1258




0.965


0.861 


1265 




0.889


0.785 


1266 


1-20


0.944


0.809 


1276 


x— *= o 


0.982


0.862


1292


1-19


0.979


0.933


1296


X — Z J. 


0.984


0.944


1297


1-19


0.984


0.953 


1332 


1-38


0.942


0.653


1358 




0.947


0.595


1371


1-33


0.957


0.789


1380


1-26


0.979


0.904


1397


1-27


0.962


0.777 


1399 




0.997


0.960 


1404 


x— 


0.998


0.977 


1410 




0.946


0.845 


1414 


1-24


0.913


0.588


1415


1-19


0.982


0.929


1416




0.931


0.891


1418 


1-30 


0.933


0.563


1420 


1-20 


0.881 


0.561


1421


1-19


0.990


0.968


1423 


1-17 


U.300 


0.863


1424 


1-21 


v . cob 


0.591


1425 


1-24 




0.588


1426


1-24 




0.588 


1428 


1-25 




0.899


1430 


1-34 




0.819


1431 


1-28 




0.923


1432 


1-36 


0.957


0.613


1433 


1-32 




0.753


1434 


1-33 


0 . 303 


0.621


1435 


1-25 


rs qi n 

u • «" , 4.U 


0.631


1436 


1-42 


U • JOS 


0.868


1437 


1-22 


0.998 


0.960 


1442 


1-20 




0.753


1448 


1-12 


0.931 


0.891 


1462 


1-18


0.968 


0.888 


1490 


1-20 


0.881


o. 


1518


1-17 


0.968 


0.863


1525 


1-21 


0.885 


0.591 


1547 


1-28 


0.974 


0.891


xbbl 


1-25 


0.967 


0.899 


TSbo 


1-17 


0.923 


0.824 


1593 


1-28 


0.979


0.923


1596 


"iic 


0.929


0.709


1601 




0.957


0.613 


1606 


1-22 


0.979


0.831


1607


1-20


0.974


0.770


1608


1-32


0.921


0.753


1614


1-33


0.969


0.829


"lgifi 


1-20 


0.959 


0.669 


1625 


1-39 


0.983 


0.621 


1632 


1-25 


0.910 


0.631 


1636 


1-33 


0.897 


0.591


1639


1-42 


0.988 


0.868 


1645 


1-20 


0.927 


0.568 


1647 


1-17 


0.923 


0.742 


1648


1-22 


0.998 


0.980 



TABLE 6



SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S. S.N.
09/488,725


1 


1787 


3573 


5359 


784CIP2 1 


1103 


2 


1788 


3574 


5360 


784CIP2 2 


AO #3 


3 


1789 


3575


5361


784CIP2_3




4 


1790 


3576 


5362 




3 33b 


5 


1791 


3577 


5363 


784CIP2_5


5562 


6 


1792 


3578 


5364




ceo 

33DZ 


7 


1793 


3579 


5365 




5562 


8


1794 


3580 


5366 




5562 


9 


1795 


3581 


5367 


784CIP2_9


5563 


10 


1796 


3582 


5368 


784CIP2_10


5564 


11


1797


3583


5369


784CIP2_11


5565 


12 


1798


3584


5370


784CIP2_12


5689 


13 


1799 


3585 


5371


784CIP2 13 


5729 


14 


1800 


3586


5372


784CIP2 14 


5745 


15 


1801 


3587 


5373


784CIP2 15 


5777 


16 


1802


3588


5374


784CIP2 16 


5777 


17 


1803


3589


5375


784CIP2 17 


5789 


18 


1804 


3590


5376 


784CIP2 18 


5792 


19 


1805 


3591 


5377 


784CIP2 19 


5804 


20 


1806 


3592 


5378


784CIP2 20 


5805 


21 


1807 


3593 


5379 


784CIP2 21 


5805 


22 


1808 


3594


5380 


784CIP2 22 


5844 


23 


1809


3595


5381


784CIP2 23 


5844 


24 


1810 


3596


5382 


784CIP2 24 


5850 


25 


1811 


3597 


5383


784CIP2 25 


5867 


26 


1812


3598


5384


784CIP2 26 


5973 


27 


1813 


3599 


5385 


784CIP2 27 


5995 


28 


1814 


3600 


5386


784CIP2 28 


5995 


29 


1815 


3601


5387


784CIP2 29


6005 


30


1816


3602


5388


784CIP2 30 


6007 


31 


1817


3603 


5389 


784CIP2_31


6007 


32 


1818 


3604


5390


784CIP2_32


6009 


33 


1819 


3605 


5391




6012 


34 


1820 


3606 


5392 


784CIP2_34


6015 


35 


1821 


3607


5393 


784CIP2_35


6016 


36 


1822 


3608 


5394 


784CIP2_36


6016


37 


1823 


3609 


5395 


784CIP2_37


6018 


38 


1824 


3610 


5396 


784CIP2_38


oUlt) 


39 


1825 


3611 


5397 


784CIP2_39


cm q 


40 


1826 


3612 


5398 


784CIP2_40


OU/J 


41 


1827 


3613 


5399 


784CIP2_41


6070 


42 


1828


3614 


5400 


784CIP2_42


6081


43 


1829 


3615 


5401 


784CIP2_43


6089


44 


1830 


3616 


5402 


784CIP2 44 


6118 


45 


1831 


3617 


5403 


784CIP2 45 


6118 


46


1832


3618


5404


784CIP2_46


6130


47


1833 


3619 


5405 


784CIP2 47 


6177 


48 


1834 


3620 


5406


784CIP2_48


6189 


49 


1835 


3621 


5407


784CIP2 49 


6191 


50 


1836 


3622


5408


784CIP2_50


6204


51 


1837 


3623 


5409 


784CIP2_51


6204 


52 


1838


3624 


5410 


784CIP2_52 


6284 


53 


1839 


3625 


5411 


784CIP2 53 


6367 


54 


1840


3626 


5412 


784CIP2_54 


6436 


55


1841


3627 


5413 


784CIP2 55 


6442 


56


1842 


3628 


5414 


784CIP2 56 


6445 


57 


1843 


3629 


5415 


784CIP2_57


6457 


58 


1844 


3630


5416 


784CIP2 58 


6458 


59


1845


3631


5417


784CIP2_59


6"4S6 ■ ■ -




1846 


3632


5418 


784CIP2 60 


6462 


61 


1847


3633


5419


784CIP2 61 


6472 


62 


1848


3634


5420


784CIP2 62 


6499 


63 


1849 


3635


5421


784CIP2 63 


6499 


64


1850 


3636 


5422 


784CIP2 64 


6505 


65


1851


3637


5423


784CIP2 65 


6534 


66


1852


3638


5424


784CIP2_66


6534


67 


1853


3639


5425 


784CIP2_67 


6540 


68


1854


3640


5426


784CIP2_68


6550 


69 


1855


3641 


5427 


784CIP2 69 


6550 


70 


1856


3642


5428


784CIP2 70 


6592


71


1857


3643 


5429 


784CIP2 71 


6645 


72 


1858


3644 


5430 


784CIP2 72 


6671


73


1859


3645


5431


784CIP2_73


6763


74


1860


3646 


5432 


784CIP2 74 


6763


75


1861


3647 


5433 


784CIP2 75 


6786 


76


1862


3648 


5434 


784CIP2 76 


6824 


77


1863 


3649 


5435 


784CIP2 77 


6830 


78


1864 


3650 


5436 


784CIP2 78 


6831


79


1865


3651


5437 


784CIP2 79 


6832 


80


1866 


3652 


5438 


784CIP2 80 


6834 


81 


1867 


3653 


5439 


784CIP2_81 


6834 


82 


1868


3654 


5440 


784CIP2 82 


6835


83 


1869


3655 


5441 


784CIP2_83 


6837 


84 


1870 


3656 


5442 


784CIP2 84 


6843 


85 


1871 


3657 


5443 


784CIP2 85 


6859 


86 


1872 


3658 


5444 


784CIP2_86


6915 


87 


1873 


3659 


5445 


784CIP2 87 


6932 


88 


1874 


3660 


5446 


784CIP2_88


6957 


89 


1875 


3661 


5447 


784CIP2_89


6961 


90


1876


3662


5448


784CIP2_90 


6973 


91


1877 


3663 


5449 


784CIP2_91


6973 


92


1878


3664


5450


784CIP2_93


7007


93


1879


3665


5451 


784CIP2_94 


7018


94


1880 


3666 


5452 


784CIP2 95 


7019 


95


1881 


3667 


5453 


784CIP2 96 


7020 


96


1882 


3668 


5454 


784CIP2_97 


7026 


97 


1883 


3669 


5455 


784CIP2 98 


7021 


98 


1884


3670 


5456 


784CIP2_99 


7023 


99 


1885 


3671 


5457 


784CIP2_100 


7027 


100 


1886


3672


5458 


784CIP2 101 


7028 


101 


1887


3673 


5459 


784CIP2_102


7029 


102 


1888


3674 


5460 


784CIP2 103 


7031 


103 


1889 


3675 


5461 


784CIP2 104 


7032 


104 


1890


3676 


5462 


784CIP2 105 


7033 


105 


1891 


3677


5463 


784CIP2 106 


7035 


106 


1892 


3678


5464 


784CIP2 107 


7036 


107 


1893 


3679


5465 


784CIP2 108 


7039 


108 


1894 


3680 


5466 


784CIP2 109 


7043 


109 


1895 


3681


5467 


784CIP2 110 


7044 


110 


1896


3682


5468


784CIP2 111 


7046 


111 


1897 


3683


5469


784CIP2 112 


7054 


112 


1898


3684


5470


784CIP2 113 


7061 


113 


1899 


3685


5471 


784CIP2 114 


"7 a 7^ 


114 


1900 


3686


5472 


784CIP2 115 


7092 


115 


1901 


3687 


5473 


784CIP2_116


7094 


116 


1902 


3688 


5474 


784CIP2 117 


7106 


117 


1903 


3689 


5475 


784CIP2 118 


7107 


118 


1904 


3690 


5476 


784CIP2 119 


7111 


119 


1905


3691 


5477 


784CIP2_120 


7123 


120 


1906


3692 


5478 


784CIP2 121 


7142 


121


1907


3693


5479


784CIP2_122


7142


122 


1908 


3694 


5480 


784CIP2_123


7154 


123 


1909 


3695 


5481 


784CIP2_124


7160 


124 


1910 


3696 


5482 


784CIP2_125


7169 


125 


1911 


3697 


5483


784CIP2_126


7185 


126


1912 


3698 


5484 


784CIP2_127


7197 


127 


1913 


3699 


5485


784CIP2_128


7219 


128 


1914 


3700


5486


784CIP2 129


7226


129


1915


3701


5487


784CIP2 130 


7229 


130 


1916 


3702 


5488


784CIP2 131


7234


131 


1917 


3703


5489 


784CIP2 132 


7235 


132 


1918


3704


5490


784CIP2_133 


7235 


133 


1919


3705


5491 


784CIP2 134 


7238 


134 


1920 


3706 


5492 


784CIP2 135 


7247 


135 


1921


3707


5493 


784CIP2 136 


7261 


136 


1922


3708 


5494 


784CIP2 137 


7262 


137


1923


3709 


5495 


784CIP2 138 


72^7 


138


1924


3710


5496


784CIP2 139 


7272 


139 


1925


3711


5497


784CIP2 140 


7273 


140


1926


3712


5498


784CIP2_141 


7282 


141 




3713 


5499 


784CIP2_142 


7288 


142 


1928 


3714 


5500 


784CIP2 143 


7291 


143 




3715 


5501 


784CIP2_144 


7293 


144 




3716 


5S02 


7G4CIP2_145 


7294 


14S 




3717 


5503 


784CIP2 146 


7299 


146 


1932 


3728 


5504 


784CIP2 14 7 


73 00 


147 


1933 


3719 


5505 


784CIP2_148 


7312 


l48~~" 


1 934 


3720 


5506 


784CIP2 149 


7313 


149 




3721 


5507 


784CIP2 150 


7315 


150 




3722 


5SC8 


784CIP2 151 


7318 


151 


1937 


3723 


5509 


784CIP2 152 


7321 


152 


IQTfl 


3724 


5510 


784CIP2 153 


7330 


153 




3 725 


S511 


784CIP2_154 


7331 


154 


194 0 


3726 


5512 


784CIP2__155 


7333 


155 


1 PA 1 


3727 


5513 


784CIP2_15£ 


7350 


IS 6 


1942 


372 8 


5514 


784CIP2 157 


7352 


157 


1943 


3729 


5515 


784CIP2 158 


7384 


158 


1944 


n 

j / JU 


5516 


784CIP2JL59 


7403 


159 


1945 


3 731 


5517 


784CIP2 160 


7431 


160 


1946 


■j too 

J / JZ 


5518 


784CTP2 161 


7441 


l£l 


1947 


""4 1 "a 1 


5519 


784CIP2 162 


7453 


162 


1948 




5520 


784CIP2 163 


7467 


163 


1945 


3735 


5521 


784CIP2 164 


7471 


164 


1950 


3736 ' 


5522 


784CIP2 165 


7493 


i6"s 


1951 


\ 7*4*7 


5523 


7B4CIP2 166 


7S02 


166 


1952 


3 738 


ceo/ 


784CIP2 167 


7511 


167 


1953 1 


3739 


-> 3£3 


784CIP2 168 


7514 


168 


1954 


3740 


CMC 


784CIP2 169 


7520 


169 


1955 


3741 


ccon 

/ 


/o4CIP2 170 


7541 


170 


1956 


374 2 


JjZtJ 


784CIP2 171 


7570 


171 


1957 


3743 




784CIP2 172 


7578 


172 


1958 


3744 "" " 




/84CIP2 173 


7583 ! 


173


1959


3745


5531


784CIP2_174


7592


174


1960


3746 


5532 


784CIP2 175 


7601 


175 


1961 


" "3747 


5533 


784CIP2 176 


7602 


176 


1962 


3748 


5534 


7B4CIP2_177 


7608 


177 


1963 


i749 


5535 


784CIP2 178 


7615 


178 


1964 


3750 


55^36 


784CIP2_179 


7617 


179 


1565 


3751 


5537 


784CIP2 181 


7624 


180 


1966 


3 7*2 


s^s 


7B4CIP2 182 


7626 


181 


"13*7 


3 753 


5539 


784CIP2 183 


7640 


182


1968


3754


5540


784CIP2_184


7641


183


1969


3755


5541


784CIP2_185


7641


184 


1970


3756


5542 


784CIP2 186 


7641 


185 


1971


3757


5543 


784CIP2__187 


7642 


186 




J /30 


5S44 


( 784CIP2_1B8 


7649 


187 


1973 




5545 


784CIP2 189 


7656 


188 


1974 


*/oll 


5546 


784CIP2 190 


7657 1 


189 


1 Q7C 

JLZf /_» 


3761 


5547 


784CIP2 191 


7657 


1 190 


13 it) 


3762 


5548 


784CIP2 192 


7662 


191 


X3 / / 


3763 


5549 


7 84CIP2_193 


1Mb 


192 




3764 


5550 


784CIP2 194 


7673 




1979 


3765 


5551 


7 84CIP2_195 


! 7690 




1980 


3766 


5552 


784CIP2_196 7700 


To? 


1981 


3767 


5553 


784CIP2 197 7709 


X9b 


1982 


3768 


5554 


784CIP2 198 


7736 


197 


1983 


3769 


5555 


784CIP2_199 


7737 


198 


1984 


3770 


5556 


784CIP2 200 


7744 


199 


1985 


3771 


5557 


784CIP2_201 


^771 


200 


1986 


1 3772 


5558 


784CIP2_202 


7786 


201 


1987 


3773 


5559 


784CIP2_203 


7791 


202 


1988 


3774 


5560 


784CIP2_204 


7797 


203 


1989 


3775 


5561 


784CIP2_205 


mot 


204 


1990 


3776 


5562 


784CIP2_206 


7812 


20S 


1991 


3777 


5563 


784dP2_207 


7812 


206 


1992 


3778 


5564 


784CIP2_208 


7818 


207 


1993 


3779 


5565 


784CIP2_209 


7822 


208 


1994 


3780 


5566 


784CIP2 210 


7827 


209 


1995 


3781 


t 5567 


784CIP2 211 


7830 


210 


1996 


3782 


5568 


784CIP2_212 


7835 


211 


1997 


3783 


5569 


784CIP2_214 


7840 


212 


1998 


3784 


5570 


784CIP2 215 


7858 


213 


1999 


3785 


5571 


784CIP2 216 


7858 


214 


2000 


3786 


5572 


784CIP2 217 


7861 


215 


2001 


3787 


5573 


784CIP2 218 


7866 


216 


2002 


3788 


5574 


784CIP2 219 


786-8 


217 


2003 


3789 


5575 


784CIP2_220 


7896 


218 


2004 | 


3790 


5576 


784CIP2 221 


7898 


219 


2005 


3791 


5577 




7900 


220 


2006 


3792 


5578 


784CIP2_223 


7906 


221 


2007 


3793 


5579 


784CIP2_224 


7908 


222 


2008 


3794 


5580 


784CIP2_225 


7909 


223 


2009 


3795 


5581 


784CIP2_226 


7917 


224 


*> ft i n 


3796 


5582 


784CIP2 227 


7932 


225 


9m l 
£uxi 


3797 


5583 


784CIP2_22 8 


7940 


226 


2012 


3798 


5584 


784CIP2 229 


7940 


227 


Oft"! t 


3799 


5585 


784CIP2 230 


7984 


228 




3800 


55B6 


784CIP2_231 


7984 


229 


2015 


JBU1 


5587 


784CIP2_232 


8001 


230 


2016 




5588 


784CIP2 233 


8021 


231 


2017 


■JOUJ 


5589 


784CIP2_234 


8029 


232 


2018 




5590 


784CIP2 235 


8033 


233 


2019 


"a one 1 


5591 


784CIP2 236 


8040 


234 




3806 


5592 


784CIP2 237 


8052 


235 


2021 


3807 


5593 


784CIP2_238 


8096 


236 


2022 


3808 


5594 


784CIP2_239 


8096 


237 


2023 


3809 


5595 


784CIP2_240 


8113 


238 


2024 


"3fllO 


5596 


784CIP2 241 


8126 


239 


2025 


3811 


5597 


784CIP2_242 


8132 


240 


2026 


3812 


5598 


784CIP2 243 


8137 


241 


2027 


3813 [ 


5599 


7B4CIP2_244 


8137 


242 


2028 


3814 


5600 


784CIP2 245 


815 9 


243 


2029 


3815 


5601


784CIP2 246 


8159 


_ 244 " 


2030 


3816 


5602 


784CIP2 247 


8161 


245 


2031 


3817 


5603 


784CIP2_24B 


8176


246


2032 


3818 


5604 


7B4CIP2 249 


8196 


247 


2033 


3819 


5605 


784ClP2_2*6 


8200 


248 


2034 


3820 


5606 


784CIP2 251 


8212 


249 


2035 


3821 


5607 


784CIP2 252 


8220 


250 


2036 


3822 


5608 


784CIP2_253 


8238 


251 


2037 


3823 


5609 


784CIP2_2S4 


8254 


252 


2038 


3824 


5610 


784CIP2 255 


8255 


253 


2039 


3825 


5611 


784CIP2 256 


8288 


254 


2040 


3826 


5612 


784CTP2_2S7 


B296 


255 


2041 


3827 


5613 


784CIP2__258 


8329 


256 


2042 


3828 


5614 


784CIP2 259 


8362 


257 


2043 


3829 


5615


784CIP2 260 


8429 


258 


2044 


3830 


5616 


784CIP2 261 


8436 ' 


259 


2045 


3831 


5617 


784CIP2_262 


8448 


260 


2046 


3832 


5618 


. 784CIP2_263 


8472 


261 


2047 


3833 


5619 


784CIP2_264 


8502 


262 


2048 


3834 


5620 


784CIP2_265 


8504 


263 


2049 


3835 


5621 


784CIP2 266 


8507


264 


2050 


3836 


5622 


784CIP2_268 


8509 


265 


2051 


3837 


5623 


784CIP2 269 


8515 


266 


2052 


3838 


5624 


784CIP2_270 


8519 


267 


2053 


3839 


5625 


784CIP2_271 


8530 


268 


2054 


3840 


5626 


784C:iP2 272 


8532 


269 


2055 


3841 


5627


784CIP2 273 


853 2 


270 


2056


3842 


5628 


784CIP2 274 


8539 


| 271 


2057 


3843 


5629 


784CIP2 275 


8541 


272 


2058 


3844 


5630 


784CIP2_276 


8543 


273 


2 059 


3845 


5631 


784CIP2 277 


8593 


274 


2060 


3846 


5632 


784CIP2 278 


8595 


275 


2061 


3847 


j 5633 


784CIP2_279 


8615 


276 


2062 


3848 


5634 


784CIP2 280 


8620 


277 


2063 


3849 


5635 


~ 784C1P2 281 


8621 


278 


2064 


3850 


" 5636 


784CIP2_282 


8623 


279 


2065 


3851 


5637 


784CIP2_283 


8625 


280 


2066 


3852 


5638 


784CIP2 284 


8628 


281 


2067 


3853 


5639 


784CIP2_285 


8628 


282 


2068 


3854


5640 


784CIP2 286 


8629 


283 


2069 


3855 


5641 


784CIP2 287 


8630 


284 


2070 


3856


5642


784CIP2_288


8631 


285 


2071 


3857 


5643 


784CIP2_289 


8633 


286 


2072 


3858 


5644 


784CIP2 290 


8634 


287 


2073 


3859 


5645 


784CIP2 291 


8635 


288 


2074 


3860 


5646


784CIP2 292 


8636 


289 


2075 


3861 


5647 


784CIP2_293 


8659 j 


290 


2076 


3862 


5648 


784CIP2_294 


8660 


291 


2077 


3863 


5649 


784CIP2 295 


B667 


292 


2078 


3864 


5650 


784CIP2_296 


8667 


293 


2079 


3865 


5651 


784CIP2 297 


8685 


294 


2080 


3866 


5652 


784CIP2 298 


8805 


295 


2081 


3867 


5653 


784CIP2_299 


8896 


296 


2082 


3868


5654 


784CIP2_300 


8978 


297 


2083 


3869 | 


5655 


784CIP2JJ01 


9046 


298 


2084 


3870 


5656 


784CIP2 302 


9048 


299 


2085 


3871


5657


784CIP2 303 


9116 


300 


2086 


3872 


5658 | 


784CIP2 304 


9195 


301 


2087 


3873 


5659 


7B4CIP2_305 


9201 


302 


2088 


3874 


5660 


7B4CIP2 306 


9307 


303 


2089 


3875


5661 


7B4CIP2_307 


9321 


304 


2090


3876


5662


7B4CIP2 308 


9397 


305 


2091 


3877 


5663 


784CIP2 309 


9405 


306 


2092


3878 


5664 


784CIP2 310 


9406 


307 


2093 


3879 


5665 


784CIP2 311 


9422


308 


2094 


3880 


5666 


| 784CIP2_312 


9494 


309 


2095 


3881


5667


784CIP2_313


9512 


310 


2096 


3882 


5668 


784CIP2_314 


9632 


311 


2097 


3883 


5669 


784CIP2 31S 


9661 


312 


2098 


3884 


5670 


7B4CIP2_316 


9664 


313 


2099 


3885 


5671 


784CIP2 317 


9691 


314 


2100 


3886 


5672 


• 784CIP2 318 


9700- 


315 


2101 


3887 


5673 


784CIP2 319 


9716 


316 


2102 


3888 


5674 


784CIP2_320 


9721 


317 


2103 


3889 


5675 


784CIP2 321 


9870 


318 


2104 


3890 


5676 


784CIP2 32-2" 


9887 


319 


^ 2105 


3891 


5677 


784CIP2J323 


9923 


320 


2106 


3892 


5678 


784CIP2 324 


9938 


321 


2107 


3893 


| 5679 


784CIP2 325 


9964 


322 


2108 


3894 


5680 


784CIP2 326 


10007


323 


2109 


3 895 


5681 


784CIP2_327 


10009 


324 


2110 


3896 


5682 


784CIP2 328 


10046 


325


2111 


3897 


5683 


784CIP2 329 


10156 


326 


2112 


3898


5684 


784CIP2_330 


10276 


327 


2113 


3899 


5685 


784CIP2 331 


10283 


328 


2114 


3900 


5686 


784CIP2B_1


152 


329


2115 


3901 


5687 


784CIP2B_2 


167 


330 


2116 


3902 


5688 


784CIP2B_3


205 


331 


2117


3903 


5689 


784CIP2B 4 


210 


332 


2118


3904 


5690


784CIP2B_5


225 


333 


2119


3905 


5691 


784CIP2B_6 


226 


334 


2120 


3906 


5692 


784CIP2B_7 


264 


335 


2121 


3907


5693


784CIP2B_8


268 


336 


2122 


3908 


5694 


784CIP2B 9 


293 


337 


2123 


3909 


5695


784CIP2B_10 


293 


338 


2124 


3910 


5696 


784CIP2B 11 


293 


339 


2125 


3911 


5697


784CIP2B_12


302 


340 


2126 


3912 


5698


784CIP2B_13


311 


341 


2127 


3913 


5699 


784CIP2B_14 


352 


342 


2128 


3914 


5700 


784CIP2B 15 


358 


343 


2129 


3915 


5701 


784CIP2B_16


368 


344 


2130 


3916 


5702 


784CIP2B_17 


393 


345 


2131 


3917 


5703 


784CIP2B_18 


477 


346 


2132 


3918 


5704 


784CIP2B_19 


508 


347 


2133 


3919 


5705 


784CIP2B_20


508 


348 


2134 


3920 


5706 


784CIP2B_21 


515 


349 


2135 


3921 


5707 


784CIP2B 22 


578 


350 


2136 


3922 


5708 


784CIP2B 23 


588 


351 


2137 


3923 


5709 


784CIP2B 24 


591 


352 


2138 


3924 


5710 


784CIP2B 25 


593 


353 


2139 


3925


5711


784CIP2B 26 


594 


354 


2140 


3926 


5712 


784CIP2B_27


619 


355 


2141 


3927 


5713 


784CIP2B_28


620 


356 


2142 


3928 


5714 


784CIP2B_29 


654


357 


2143 


3929


5715


784CIP2B_30 


692 


358 


2144 


3930 


5716 


784CIP2B_31 


753 


359 


2145 


3931 


5717 


784CIP2B 32 


758 


360 


2146 


3932 


5718 


784CIP2B_33 


787 


361 


2147 


3933 


5719 


784CIP2B 34 


833 


362 


2148 


3934 


5720 


784CIP2B_35


838 


363 


2149 


3935 


5721 


784CIP2B_36


870 


364 


2150 


3936 


5722 


784CIP2B_37


891 


365 


2151 


3937 


5723 


784CIP2B_38


891 


366 


2152 


3938


5724


784CIP2B_39 


921 


367 


2153 


3939 


5725 


784CIP2B_40 


924 


368 


2154 


3940 


5726 


784CIP2B 41 


932 


369 


2155


3941 


5727 


784CIP2B 42 


942


370 


2156


3942


5728 


784CIP2B 43 


958 


371 


2157


3943 


5729 


784CIP2B_44 


968


372 


2158


3944


5730


784CIP2B 45 


992 


373


2159


3945


5731


784CIP2B_46


1025 


374 


2160


3946


5732 


784CIP2B_47


1074 


375


2161


3947


5733 


784CIP2B_48 


1104 


376 


2162


3948


5734 


784CIP2B 49 


1114 


377


2163


3949


5735


784CIP2B 50 


1144 


378 


2164


3950


5736


784CIP2B 51 


1262


379


2165


3951


5737


784CIP2B_52 


1318


380


2166 


3952 


5738 


784CIP2B 53 


1319


381


2167


3953 


5739


784CIP2B 54 


1328


382


2168


3954 


5740


784CIP2B 55 


1436 


383


2169 


3955 


5741 


784CIP2B_56 


1464


384


2170


3956 


5742 


784CIP2B 57 


1584


385


2171 


3957


5743 


784CIP2B 58 


1617 


386


2172 


3958 


5744 


784CIP2B_59 


1724 


387 


2173 


3959 


5745 


784CIP2B_60 


1728 


388


2174 


3960 


5746 


784CIP2B 61 


1772 


389 


2175 


3961 


5747 


784CIP2B 62 


1809


390 


2176 


3962 


5748 


784CIP2B 63 


1868 


391 


2177 


3963 


5749 


784CIP2B 64 


1898 


392 


2178 


3964 


5750 


784CIP2B_65


1926 


393 


2179 


3965


5751 


784CIP2B 66 


1965 


394 


2180 


3966


5752 


784CIP2B_67


1967 


395


2181 


3967 


5753 


784CIP2B 68 


1995 


396 


2182 


3968 


5754 


784CIP2B_69 


2005


397 


2183 


3969


5755 


784CIP2B 70 


2027 


398 


2184 


3970 


5756 


784CIP2B_71


2055 


399


2185 


3971 


5757 


784CIP2B 72 


2103 


400 


2186 


3972


5758 


784CIP2B_73 


2106 


401 


2187 


3973 


5759 


784CIP2B_74 


2166 


402 


2188 


3974 


5760 


784CIP2B_75


2175 


403 


2189 


3975


5761 


784CIP2B_76


2176 


404 


2190 


3976 


5762 


784CIP2B 78 


2236


405


2191 


3977 


5763 


784CIP2B 79 


2250


406


2192


3978 


5764


784CIP2B 80 


2306


407 


2193 


3979 


5765


784CIP2B 81 


2323 


408


2194 


3980 


5766


784CIP2B 82 


2340 


409 


2195 


3981 


5767 


784CIP2B 83 


2371 


410 


2196 


3982 


5768 


784CIP2B 84 


2399 


411 


2197


3983


5769 


784CIP2B 85 


2411 


412 


2198


3984


5770 


784CIP2B 86 


2428 


413


2199


3985


5771 


784CIP2B 87 


2430 


414


2200


3986


5772 


784CIP2B 88 


2439 


415 


2201 


3987 


5773 


784CIP2B 89 


2447 


416 


2202


3988


5774 


784CIP2B 90 


2461 


417 


2203


3989


5775


784CIP2B_91


2487 


418 


2204 


3990 


5776


784CIP2B_92 


2492 


419 


2205


3991


5777


784CIP2B 93 


2512 


420


2206 


3992


5778 


784CIP2B_94 


2564 


421 


2207 


3993 


5779 


784CIP2B_95


2678 


422


2208 


3994 


5780


784CIP2B 96 


2816 


423


2209 


3995 


5781 


784CIP2B_97


*olo 


424 


2210 


3996 


5782 


784CIP2B 98 


2819 


425 


2211 


3997 


5783 


784CIP2B 99 


2943 


426 


2212 


3998 


5784 


784CIP2B 100 


3137 


427 


2213 


3999 


5785 


784CIP2B 101 


3137


428 


2214 


4000 


5786


784CIP2B 102 


3160 


429 


2215


4001 


5787 


784CIP2B_103 


3323 


430 


2216 


4002 


5788 


784CIP2B 104 


3360 


431 


2217 


4003 


5789 


784CIP2B 105 


3362


432 


2218 


4004 


5790 


784CIP2B_106


3417 


433 


2219 


4005 


5791


784CIP2B_107


3418 


434 


2220 


4006


5792


784CIP2B_108


3442 


435


2221 


4007 


5793


784CIP2B_109


3442 


436 


2222 


4008


5794


784CIP2B 110 


3444 


437 


2223 


4009


5795 


784CIP2B_111


3855 


438


2224


4010 


5796 


784CIP2B 112 


3863 


439 


2225


4011 


5797 


784CIP2B 113 


4090 


440 


2226


4012


5798 


784CIP2B 114 


4105 


441 


2227


4013 


5799 


784CIP2B 115 


4142


442


2228


4014 


5800


784CIP2B_116


4142 


443


2229 


4015 


5801 


784CIP2B 117 


4149 


444


2230


4016 


5802 


784CIP2B 118 


4196 


445


2231 


4017 


5803 


784CIP2B 119 


4202 


446


2232


4018 


5804 


784CIP2B 120 


4274 


447


2233 


4019 


5805 


784CIP2B_121 


4304 


448


2234 


4020 


5806 


784CIP2B 122 


4306 


449


2235 


4021 


5807


784CIP2B_123 


4311


450


2236


4022 


5808 


784CIP2B 124 


4321


451


2237


4023 


5809 


784CIP2B 125 


4323 


452


2238 


4024 


5810 


784CIP2B_126 


4332 


453 


2239 


4025 


5811 


784CIP2B 127 


4488 


454 


2240 


4026 


5812 


784CIP2B_128 


4588 


455 


2241 


4027 


5813


784CIP2B 129 


5569 


456 


2242 


4028 


5814 


784CIP2B_130 


5573 


457 


2243 


4029 


5815 


784CIP2B 131 


5577


458 


2244 


4030 


5816 


784CIP2B_132 


5579 


459 


2245 


4031 


5817 


784CIP2B_133 


5582 


460 


2246 


4032 


5818 


784CIP2B_134 


5583 


461 


2247


4033 


5819 


784CIP2B_135 


5584 


462


2248 


4034 


5820 


784CIP2B 136 


5585 


463


2249 


4035 


5821 


784CIP2B_137 


5591 


464


2250 


4036 


5822 


784CIP2B_138 


5593 


465


2251 


4037


5823 


784CIP2B_139 


5594


466


2252 


4038 


5824 


784CIP2B 140 


5594 


467


2253 


4039 


5825 


784CIP2B_141 


5598 


468


2254 


4040 


5826 


784CIP2B_142 


5602 


469


2255 


4041 


5827 


784CIP2B 143 


5605 


470 


2256 


4042 


5828 


784CIP2B_144 


5608 


471 


2257


4043 


5829 


784CIP2B_145 


5617 


472


2258


4044


5830


784CIP2B_146


5620 


473 


2259


4045 


5831 


784CIP2B 147 


5622 


474 


2260


4046


5832


784CIP2B 148 


5623 


475 


2261 


4047 


5833 


784CIP2B 149 


5624


476 


2262 


4048


5834


784CIP2B 150 


5625 


477 


2263 


4049


5835 


784CIP2B_151


5627 


478 


2264 


4050


5836


784CIP2B_152


5628 


479 


2265


4051 


5837


784CIP2B_153


5630 


480 


2266 


4052 


5838


784CIP2B_154


5632 


481 


2267 


4053 


5839 


784CIP2B 155 


5640 


482 


2268 


4054


5840 


784CIP2B_156


5641 


483


2269 


4055


5841 


784CIP2B 157 


5643 


484 


2270


4056


5842


784CIP2B 158 


5647 


485 


2271 


4057 


5843 


784CIP2B 159 


5649 


486 


2272 


4058 


5844 


784CIP2B_160 


5658


487 


2273 


4059 


5845 


784CIP2B 161 


5659 


488 


2274 


4060 


5846 


784CIP2B_162


5667 


489 


2275 


4061


5847 


784CIP2B_163 


5672 


490 


2276 


4062 


5848 


784CIP2B_164


S*1l 


491


2277 


4063 


5849 


784CIP2B 165 


5678 


492


2278 


4064 


5850


784CIP2B 166 


5680 


493 


2279 


4065


5851


784CIP2B 167 


5684


494 


2280


4066


5852


784CIP2B 168 


5686 


495


2281


4067 


5853


784CIP2B 169 


5694


496


2282


4068


5854 


784CIP2B 170 


5698 


497 


2283 


4069 


5855 


784CIP2B 171 


5699 


498


2284


4070


5856 


784CIP2B 172 


5712 


499


2285


4071 


5857 


784CIP2B 173 


5719 


500


2286


4072


5858 


784CIP2B_174


5720


501


2287


4073 


5859 


784CIP2B_175


5727


502


2288


4074


5860 


784CIP2B 176 


5730


503


2289


4075 


5861 


784CIP2B_177


5734 


504 


2290 


4076 


5862 


784CIP2B 178 


5738


505 


2291 


4077 


5863


784CIP2B 179 


5739


506


2292


4078 


5864 


784CIP2B_180


5740 


507 


2293 


4079 


5865 


784CIP2B 181 


5744 


508 


2294 


4080 


5866 


784CIP2B_182 


5748 


509 


2295 


4081 


5867 


784CIP2B__183 


5749 


510 


2296 


4082 


5868 


784CIP2B 184 


5750 


511 


2297 


4083 


5869 


784CIP2B_185 


5750 


512 


2298 


4084 


5870 


784CIP2B_186


5750 


513 


2299 


4085 


5871 


784CIP2B_187


5761 


514 


2300 


4086 


5872 


784CIP2B_188


5762 


515 


2301


4087 


5873 


784CIP2B_189


5767 


516 


2302


4088 


5874 


784CIP2B_190


5773 


517 


2303 


4089 


5875 


784CIP2B 191 


5783 


518


2304 


4090 


5876 


784CIP2B_192


5784 


519 


2305 


4091 


5877 


784CIP2B 193 


5788 


520 


2306 


4092


5878 


784CIP2B 194 


5798 


521 


2307 


4093 


5879 


784CIP2B 196 


5807 


522 


2308 


4094 


5880 


784CIP2B_197 


5818 


523 


2309 


4095 


5881 


784CIP2B 198 


5819 


524 


2310 


4096 


5882


784CIP2B_199 


5827 


525


2311 


4097 


5883 


784CIP2B 200 


5828 


526 


2312 


4098 


5884 


784CIP2B_201


5842


527 


2313 


4099 


5885 


784CIP2B_202 


5853 


528 


2314 


4100 


5886 


784CIP2B_203


5861 


529 


2315


4101 


5887 


784CIP2B_204


5864 


530 


2316 


4102 


5888 


784CIP2B_205


5865


531 


2317 


4103 


5889 


784CIP2B_206


5871 


532 


2318 


4104 


5890 


784CIP2B 207 


5873


533


2319


4105 


5891 


784CIP2B 208 


5873 


534


2320


4106


5892 


784CIP2B 209 


5875


535


2321


4107 


5893 


784CIP2B 210 


5878 


536 


2322


4108 


5894 


784CIP2B_211


5879 


537


2323


4109


5895 


784CIP2B_212 


5880 


538


2324


4110


5896 


784CIP2B_213 


5880 


539


2325


4111


5897 


784CIP2B 214 


5880 


540 


2326


4112 


5898 


784CIP2B_215


5880


541 


2327 


4113 


5899


784CIP2B 216 


5885 


542


2328


4114


5900 


784CIP2B_217 


5895 


543 


2329


4115 


5901 


784CIP2B 218 


5898 


544 


2330


4116 


5902 


784CIP2B 219 


5902 


545 


2331 


4117 


5903 


784CIP2B 220 


5904 


546 


2332 


4118 


5904 


784CIP2B 221 


5918 


547 


2333 


4119 


5905


784CIP2B_222


5921


548 


2334 


4120 


5906


784CIP2B 223 


5927 


549 


2335 


4121 


5907 


784CIP2B 224 


5932 


550 


2336 


4122 


5908 


784CIP2B_225


5939 


551 


2337 


4123 


5909 


784CIP2B_226


5945 


552 


2338 


4124 


5910 


784CIP2B 227 


5946 


553 


2339 


4125 


5911 


784CIP2B_228 


5947 


554 


2340 


4126 


5912 


784CIP2B_229 


5956 


555 


2341 


4127 


5913 


784CIP2B_230


5967


556 


2342 


4128 


5914 


784CIP2B_232 


5975


557 


2343 


4129 


5915 


784CIP2B_233 


5977 


558 


2344 


4130 


5916 


784CIP2B 234 


5978 


559 


2345 


4131 


5917 


784CIP2B 235 


5979 


560 


2346 


4132 


5918 


784CIP2B 236 


5980


561 


2347 


4133 


5919 


784CIP2B_237


5988 


562 


2348 


4134 


5920 


784CIP2B_238


5989 


563 


2349 


4135 


5921 


784CIP2B 239 


5991 


564 


2350 


4136 


5922 


784CIP2B_240 


5997


565 


2351 


4137 


5923 


784CIP2B_241


5998 


566 


2352 


4138 


5924 


784CIP2B 242 


6003 


567 


2353 


4139 


5925 


784CIP2B_243


6004 


568 


2354 


4140 


5926 


784CIP2B 244 


6013 


569 


2355 


4141 


5927 


784CIP2B_245


6028 


570 


2356 


4142 


5928 


784CIP2B_246


6028 


571 


2357 


4143 


5929 


784CIP2B_247 


6029 


572 


2358 


4144 


5930 


784CIP2B_248


6031 


573 


2359 


4145 


5931 


784CIP2B_249


6031


574 


2360


4146 


5932 


784CIP2B 250 


6032 


575


2361 


4147 


5933 


784CIP2B_251 


6037 


576 


2362 


4148 


5934 


784CIP2B_252 


6037 


577 


2363 


4149 


5935 


784CIP2B_253


6043 


578


2364


4150 


5936 


784CIP2B 254 


6044 


579 


2365 


4151 


5937 


784CIP2B_255


6046 


580 


2366 


4152 


5938 


784CIP2B_256 


6048 


581 


2367 


4153 


5939 


784CIP2B 257 


6049 


582 


2368 


4154


5940 


784CIP2B_258


6051 


583 


2369 


4155 


5941 


784CIP2B 259 


6053 


584 


2370 


4156


5942 


784CIP2B 260 


6060


585 


2371 


4157 


5943 


784CIP2B 261 


6063 


586 


2372 


4158 


5944 


784CIP2B_262


6066 


587


2373 


4159 


5945 


784CIP2B_263 


6067 


588 


2374 


4160 


5946 


784CIP2B_264


6068 


589 


2375


4161 


5947 


784CIP2B_265 


6073 


590 


2376


4162 


5948


784CIP2B_266


6076 


591 


2377 


4163


5949 


784CIP2B 267 


6076 


592 


2378 


4164 


5950 


784CIP2B_268 


6077 


593 


2379 


4165


5951 


784CIP2B_269 


6079 


594 


2380 


4166 


5952 


784CIP2B_270


6082 


595 


2381 


4167


5953 


784CIP2B 272 


6088 


596 


2382 


4168 


5954 


784CIP2B_273 


6091 


597 


2383


4169


5955 


784CIP2B_274 


6094 


598 


2384 


4170 


5956 


784CIP2B_275 


6101 


599 


2385 


4171 


5957


784CIP2B_276 


6103 


600 


2386 


4172 


5958 


784CIP2B 277 


6104 


601 


2387 


4173 


5959 


784CIP2B_278 


6108 


602 


2388 


4174 


5960 


784CIP2B_279 


6112 


603 


2389 


4175 


5961 


784CIP2B 280 


6121 


604 


2390 


4176 


5962 


784CIP2B 281 


6125


605 


2391 


4177 


5963 


784CIP2B_282 


6126 


606 


2392 


4178 


5964


784CIP2B_283


6128 


607 


2393 


4179 


5965


784CIP2B_284


6129 


608 


2394 


4180 


5966 


784CIP2B 285 


6133 


609 


2395 


4181 


5967 


784CIP2B_286


6133 


610 


2396 


4182 


5968 


784CIP2B_287


6135 


611 


2397 


4183 


5969 


784CIP2B_288


6139 


612 


2398 


4184 


5970 


784CIP2B 289 


6141 


613 


2399 


4185 


5971 


784CIP2B_290


6145 


614 


2400 


4186 


5972 


784CIP2B_291 


6146 


615 


2401 


4187 


5973 


784CIP2B 292 


6148 


616 


2402 


4188 


5974 


784CIP2B 293 


6149 


617 


2403


4189 


5975 


784CIP2B 294 


6149


618


2404


4190 


5976 


784CIP2B_295 


6153


619


2405


4191 


5977 


784CIP2B_296


6159


620


2406


4192 


5978 


784CIP2B 297 


6164 


621


2407


4193 


5979 


784CIP2B_298


6167


622


2408


4194 


5980 


784CIP2B_299 


6172


623


2409


4195 


5981 


784CIP2B 300 


6173 


624 


2410 


4196 


5982 


784CIP2B 301 


6190 


625


2411 


4197 


5983 


784CIP2B_302


6194 


626


2412 


4198 


5984 


784CIP2B 303 


6196 


627 


2413 


4199 


5985 


784CIP2B 304 


6197 


628 


2414 


4200 


5986 


784CIP2B 305 


6198 


629 


2415 


4201 


5987 


784CIP2B 306 


6198 


630 


2416 


4202 


5988


784CIP2B 308 


6214 


631 


2417 


4203 


5989 


784CIP2B_309 


6213 


632 


2418 


4204 


5990 


784CIP2B_310 


6219 


633 


2419 


4205 


5991 


784CIP2B_311 


6226 


634 


2420 


4206 


5992 


784CIP2B 312 


6229 


635


2421 


4207 


5993 


784CIP2B_313 


6234 


636


2422 


4208 


5994


784CIP2B_314


6237 


637


2423 


4209 


5995 


784CIP2B 315 


6238 


638


2424 


4210


5996 


784CIP2B_316


6239 


639 


2425 


4211 


5997 


784CIP2B_317 


6239


640 


2426 


4212 


5998 


784CIP2B 318 


6239 


641 


2427 


4213 


5999 


784CIP2B_319


6240 


642 


2428 


4214 


6000 


784CIP2B_320


6244 


643 


2429 


4215 


6001 


784CIP2B 321 


6245 


644 


2430 


4216 


6002 


784CIP2B_322 


6250


645 


2431 


4217 


6003 


784CIP2B 323 


6252 


646 


2432 


4218 


6004 


784CIP2B 324 


6252 


647 


2433 


4219 


6005 


784CIP2B 325 


6256 


648 


2434 


4220 


6006 


784CIP2B_326 


6260 


649 


2435 


4221 


6007 


784CIP2B 327 


6261


650 


2436 


4222 


6008 


784CIP2B_328


6264


651 


2437 


4223 


6009 


784CIP2B 329 


6265 


652


2438 


4224 


6010


784CIP2B_330 


6266 


653 


2439 


4225 


6011 


784CIP2B 331 


6270 


654 


2440 


4226 


6012 


784CIP2B 332 


6271 


655 


2441 


4227 


6013 


784CIP2B_334


6274 


656 


2442 


4228 


6014 


784CIP2B 335 


6276 


657 


2443 


4229 


6015 


784CIP2B_336


6281


658


2444


4230 


6016 


784CIP2B_337 


6281


659


2445


4231 


6017 


784CIP2B 338 


6288 


660


2446 


4232 


6018


784CIP2B_339 


6292 


661


2447 


4233 


6019 


784CIP2B_340 


6294 


662 


2448


4234 


6020 


784CIP2B_343


6312


663


2449 


4235 


6021 


784CIP2B_344 


6312


664


2450


4236 


6022 


784CIP2B_345


6312 


665


2451


4237


6023 


784CIP2B 346 


6322 


666


2452


4238


6024 


784CIP2B 347 


6324 


667


2453 


4239 


6025 


784CIP2B 349 


6329 


668


2454 


4240 


6026 


784CIP2B 350 


6331 


669


2455


4241


6027 


784CIP2B 351 


6333 


670


2456


4242


6028 


784CIP2B_352


6334 


671 


2457 


4243 


6029 


784CIP2B_353 


6337 


672 


2458 


4244 


6030


784CIP2B_354


6339 


673 


2459 


4245 


6031 


784CIP2B_355 


6346 


674 


2460 


4246 


6032 


784CIP2B 356 


6348 


675 


2461 


4247 


6033


784CIP2B_357


6348


676 


2462 


4248 


6034


784CIP2B 358 


6350 


677 


2463 


4249 


6035 


784CIP2B_359


6351 


678 


2464 


4250 


6036


784CIP2B 360 


6355 


679 


2465


4251


6037 


784CIP2B_361


6362


680


2466 


4252 


6038 


784CIP2B_362 


6368 


681


2467 


4253 


6039 


784CIP2B_363


6369 


682


2468 


4254 


6040 


784CIP2B 364 


6371 


683


2469 


4255 


6041


784CIP2B 365 


6376


684


2470


4256 


6042 


784CIP2B_366


6379


685


2471 


4257 


6043


784CIP2B 367 


6380 


686


2472 


4258 


6044 


784CIP2B 368 


6381 


687


2473 


4259 


6045 


784CIP2B_369 


6392 


688


2474 


4260 


6046 


784CIP2B_370 


6395 


689 


2475 


4261 


6047 


784CIP2B 371 


6397 


690 


2476 


4262 


6048 


784CIP2B 372 


6400 


691 


2477 


4263 


6049 


784CIP2B_373


6401


692 


2478 


4264 


6050 


784CIP2B_374


6411 


693 


2479


4265 


6051 


784CIP2B 375 


6411 


694 


2480 


4266 


6052


784CIP2B_376 


6411 


695 


2481 


4267 


6053 


784CIP2B_377


6416 


696 


2482


4268 


6054 


784CIP2B 378 


6418


697 


2483


4269 


6055 


784CIP2B 379 


6422 


698


2484 


4270 


6056


784CIP2B_380


6423 


699 


2485 


4271 


6057 


784CIP2B_381


6426 


700 


2486 


4272 


6058 


784CIP2B 382 


6427


701 


2487 


4273 


6059


784CIP2B 383 


6428 


702 


2488


4274


6060


784CIP2B_384


6429 


703 


2489 


4275 


6061 


784CIP2B_385


6430 


704 


2490 


4276 


6062


784CIP2B_386


6432 


705 


2491 


4277 


6063 


784CIP2B_387


6432 


706 


2492 


4278 


6064 


784CIP2B 388 


6438 


707 


2493 


4279 


6065 


784CIP2B_389


6441 


708 


2494 


4280 


6066 


784CIP2B 390 


6446 


709 


2495 


4281 


6067 


784CIP2B 391 


6454 


710 


2496 


4282 


6068 


784CIP2B 392 


6459 


711 


2497 


4283 


6069 


784CIP2B 394 


6461 


712 


2498 


4284 


6070 


784CIP2B_395


6467


713 


2499 


4285 


6071 


784CIP2B_396 


6468


714 


2500 


4286 


6072 


784CIP2B 397 


6487 


715 


2501 


4287 


6073 


784CIP2B_398 


6491 


716 


2502 


4288 


6074 


784CIP2B_399


6506


717 


2503 


4289 


6075 


784CIP2B 401 


6514 


718


2504 


4290 


6076 


784CIP2B_402


6519 


719 


2505 


4291 


6077 


784CIP2B_403


6521 


720 


2506 


4292 


6078 


784CIP2B_404


6532 


721 


2507 


4293 


6079 


784CIP2B 405 


6536 


722


2508


4294 


6080 


784CIP2B_406 


6543 


723


2509 


4295 


6081 


784CIP2B_407 


6544 


724


2510 


4296 


6082 


784CIP2B_408


6548 


725


2511 


4297 


6083 


784CIP2B_409


6551 


726


2512 


4298 


6084 


784CIP2B 410 


6551


727


2513


4299 


6085


784CIP2B 411 


6552 


728


2514 


4300 


6086 


784CIP2B 412 


6554


729


2515 


4301 


6087 


784CIP2B 413 


6556


730


2516 


4302 


6088 


784CIP2B_414


6560


731


2517 


4303 


6089 


784CIP2B_415


6563


732


2518 


4304 


6090 


784CIP2B 416 


6564 


733 


2519 


4305 


6091


784CIP2B_417


6567


734 


2520 


4306


6092 


784CIP2B 418 


6573 


735


2521


4307 


6093 


784CIP2B 419 


6575 


736 


2522 


4308


6094 


784CIP2B_420 


6577 


737 


2523 


4309 


6095


784CIP2B 421 


6593 


738 


2524 


4310 


6096 


784CIP2B_422


6595 


739 


2525 


4311 


6097 


784CIP2B_423 


6599 


740 


2526 


4312 


6098 


784CIP2B 424 


6625 


741 


2527 


4313 


6099 


784CIP2B 425 


6625


742


2528


4314 


6100 


784CIP2B 426 


6626 


743 


2529 


4315 


6101 


784CIP2B 427 


6630 


744


2530 


4316 


6102 


784CIP2B 428 


6631


745


2531


4317 


6103 


784CIP2B 429 


6632 


746


2532 


4318 


6104 


784CIP2B 430 


6633 


747


2533 


4319 


6105 


784CIP2B 431 


6634 


748 


2534 


4320 


6106 


784CIP2B_432 


6638 


749 


2535 


4321 


6107 


784CIP2B 433 


6641


750 


2536 


4322 


6108 


784CIP2B_434 


6644


751 


2537 


4323 


6109 


784CIP2B 435 


6646 


752 


2538 


4324 


6110 


784CIP2B_436


6648 


753 


2539 


4325 


6111 


784CIP2B 437 


6652


754 


2540 


4326 


6112 


784CIP2B_438


6654


755 


2541


4327 


6113 


784CIP2B_439


6657 


756 


2542 


4328 


6114 


784CIP2B 440 


6658 


757 


2543 


4329 


6115 


784CIP2B_441


6663 


758 


2544 


4330 


6116 


784CIP2B 442 


6664 


759


2545 


4331 


6117 


784CIP2B_443


6668 


760 


2546 


4332 


6118 


784CIP2B_444 


6669 


761 


2547 


4333 


6119 


784CIP2B 445 


6673 


762 


2548 


4334 


6120 


784CIP2B_446


6685 


763 


2549


4335 


6121 


784CIP2B 447 


6687 


764 


2550 


4336 


6122 


784CIP2B 448 


6689 


765 


2551 


4337 


6123 


784CIP2B 449 


6693 


766 


2552 


4338 


6124 


784CIP2B 450 


6698


767 


2553 


4339 


6125 


784CIP2B 451 


6699 


768 


2554


4340 


6126 


784CIP2B 452 


6705 


769 


2555 


4341 


6127 


784CIP2B_453 


6711 


770 


2556 


4342 


6128 


784CIP2B 454 


6713 


771 


2557 


4343 


6129 


784CIP2B_455


6716 


772 


2558 


4344 


6130 


784CIP2B 456 


6725 


773 


2559


4345 


6131 


784CIP2B_457


6726 


774 


2560 


4346 


6132 


784CIP2B_458


6727


775 


2561 


4347 


6133 


784CIP2B_459 


6730 


776 


2562 


4348 


6134


784CIP2B_460


6730 


777 


2563 


4349 


6135 


784CIP2B_461 


6730 


778 


2564


4350 


6136 


784CIP2B 462 


6732 


779


2565 


4351 


6137 


784CIP2B_463


6733 


780 


2566 


4352 


6138 


784CIP2B 464 


6737 


781 


2567 


4353 


6139


784CIP2B_465


6745 


782 


2568 


4354 


6140 


784CIP2B_466 


6751 


783 


2569 


4355 


6141


784CIP2B_467


6754 


784 


2570 


4356 


6142 


784CIP2B_468


6758


785


2571 


4357 


6143 


784CIP2B 469 


6761 


786


2572 


4358 


6144 


784CIP2B 470 


6765 


787


2573 


4359 


6145 


784CIP2B_471 


6768


788 


2574 


4360 


6146 


784CIP2B_472


6773 


789


2575


4361


6147 


784CIP2B 473 


6776


790


2576


4362 


6148 


784CIP2B_474 


6796 


791


2577 


4363 


6149 


784CIP2B_475 


6798 


792


2578


4364 


6150 


784CIP2B_476 


6823


793


2579 


4365 


6151 


784CIP2B_477 


6825 


794


2580 


4366 


6152


784CIP2B_478 


6826 


795 


2581 


4367


6153


784CIP2B_479


6839


796 


2582 


4368 


6154 


784CIP2B_480 


6844 


797 


2583 


4369


6155 


784CIP2B_482 


6849 


798 


2584 


4370 


6156 


784CIP2B 483 


6854


799 


2585 


4371 


6157 


784CIP2B_484


6857


800 


2586 


4372 


6158


784CIP2B 485 


6861 


801 


2587


4373 


6159 


784CIP2B 486 


6873 


802 


2588


4374 


6160 


784CIP2B 487 


6875 


803


2589


4375 


6161 


784CIP2B 488 


6877


804 


2590


4376


6162


784CIP2B 489 


6880 


805 


2591 


4377 


6163 


784CIP2B 490 


6885 


806


2592


4378


6164


784CIP2B 491 


6890


807


2593


4379


6165 


784CIP2B 492 


6890 


808 


2594


4380


6166


784CIP2B 493 


6894 


809 


2595


4381 


6167 


784CIP2B 494 


6901 


810


2596


4382


6168 


784CIP2B_495


6904 


811 


2597


4383 


6169 


784CIP2B_496 


6907 


812


2598


4384


6170 


784CIP2B_497


6914


813


2599 


4385 


6171 


784CIP2B 498 


6917 


814


2600


4386 


6172 


784CIP2B 499 


6923


815


2601


4387 


6173 


784CIP2B 500 


6929 


816


2602 


4388 


6174 


784CIP2B_501


6931 


817


2603 


4389 


6175 


784CIP2B 502 


6935 


818


2604 


4390 


6176 


784CIP2B 503 


6940 


819 


2605 


4391 


6177 


784CIP2B_504


6945 


820 


2606 


4392 


6178 


784CIP2B_505


6946 


821


2607


4393 


6179 


784CIP2B 506 


6947 


822 


2608 


4394 


6180 


784CIP2B 507 


6949 


823 


2609 


4395 


6181 


784CIP2B 508 


6959 


824 


2610 


4396


6182 


784CIP2B 509 


6960 


825 


2611 


4397 


6183 


784CIP2B 510 


6962 


826 


2612 


4398 


6184 


784CIP2B 511 


6963


827 


2613 


4399 


6185 


784CIP2B 512 


6967 


828 


2614 


4400 


6186


784CIP2B 513 


6983


829 


2615 


4401 


6187


784CIP2B_514


6988


830 


2616 


4402


6188


784CIP2B 515 


6996


831 


2617 


4403


6189


784CIP2B 516 


7003 


832


2618 


4404 


6190 


784CIP2B 517 


7016 


833 


2619 


4405 


6191 


784CIP2B 518 


7017 


834 


2620 


4406 


6192 


784CIP2B 519 


7025 


835


2621 


4407 


6193 


784CIP2B_520


7025 


836


2622 


4408 


6194 


784CIP2B_521


7025 


837 


2623 


4409


6195


784CIP2B_522 


7050 


838 


2624 


4410


6196


784CIP2B 523 


7051 


839 


2625 


4411 


6197 


784CIP2B 524 


7055 


840 


2626 


4412 


6198 


784CIP2B_525


7060 


841 


2627 


4413 


6199 


784CIP2B_526 


7064 


842 


2628 


4414 


6200 


784CIP2B 527 


7067


843


2629


4415 


6201 


784CIP2B 528 


7071 


844


2630


4416


6202 


784CIP2B_529 


7072


845


2631


4417 


6203 


784CIP2B 530 


7073 


846


2632


4418


6204 


784CIP2B_531 


707"£ 


847 


2633


4419 


6205 


784CIP2B 532 


7088 


848


2634


4420


6206


784CIP2B_533 


7089 


849 


2635 


4421 


6207 


784CIP2B 534 


7091 


850


2636


4422


6208 


784CIP2B 535 


7091 


851 


2637 


4423 


6209 


784CIP2B 536 


7104 


852 


2638


4424 


6210 


784CIP2B 537 


7105 


853 


2639 


4425 


6211 


784CIP2B_538


7105 


854 


2640


4426 


6212 


784CIP2B 539 


7109 


855


2641 


4427 


6213


784CIP2B 540 


7109 


856 


2642 


4428 


6214 


784CIP2B_541


7119 


857 


2643 


4429 


6215 




i~\ o n 
1X4 U 


858


2644 


4430 


6216 


784CIP2B_543 


7121 


859 


2645 


4431 


6217 


784CIP2B 544 


7126 


860 


2646 


4432 


6218 


784CIP2B_545


7127 


861 


2647 


4433 


6219 


784CIP2B 546 


7130 


862 


2648 


4434 


6220 


784CIP2B 547 


7131 


863 


2649 


4435 


6221 


784CIP2B_548


7144 


864


2650 


4436 


6222 


784CIP2B 549 


7159 


865 


2651 


4437


6223 


784CIP2B 550 


7163 
SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725


866 


2652 


4438 


6224


784CIP2B 551 


7175 


867 


2653 


4439 


6225 


784CIP2B_552 


7188 


868 


2654 


4440


6226 


784CIP2B 553 


7189 


869 


2655 


4441 


6227 


784CIP2B_554 


7190 


870 


2656 


4442 


6228 


784CIP2B 555 


7191 


871 


2657 


4443 


6229 


784CIP2B_556 


7203 


872 


2658 


4444 


6230 


784CIP2B 557 


7204 


873 


2659 


4445 


6231 


784CIP2B 558 


7208 


874 


2660


4446 


6232 


784CIP2B_559 


7209 


875 


2661 


4447 


6233 


784CIP2B_560


7210 


876 


2662 


4448 


6234 


784CIP2B 561 


7216 


877 


2663 


4449 


6235 


784CIP2B 562 


7221 


878 


2664 


4450 


6236 


784CIP2B_563 


7230 


879 


2665 


4451 


6237 


784CIP2B_564


7237 


880 


2666


4452 


6238 


784CIP2B_565


7240 


881 


2667 


4453 


6239 


784CIP2B_566 


7245 


882 


2668 


4454


6240 


784CIP2B_567 


7250 


883 


2669 


4455 


6241 


784CIP2B_568


7251


884 


2670


4456


6242 


784CIP2B 569 


7255 


885 


2671 


4457 


6243 


784CIP2B 570 


7260 


886


2672 


4458 


6244 


784CIP2B_571 


7265 


887 


2673 


4459 


6245 


784CIP2B 572 


7268 


888


2674


4460


6246


784CIP2B_573


7275 


889


2675 


4461 


6247 


784CIP2B_574 


7279 


890 


2676 


4462 


6248 


784CIP2B_575 


7283 


891 


2677 


4463 


6249 


784CIP2B 576 


7283 


892 


2678 


4464 


6250


784CIP2B_577


7287 


893 


2679 


4465 


6251


784CIP2B_578 


7301


894


2680 


4466 


6252 


784CIP2B_579 


7308


895 


2681 


4467 


6253 


784CIP2B_580 


7308 


896 


2682 


4468 


6254 


784CIP2B_581 


7309 


897 


2683 


4469


6255 


784CIP2B_582 


7319 


898 


2684 


4470 


6256 


784CIP2B_583 


7320 


899 


2685 


4471 


6257 


784CIP2B 584 


7326 


900 


2686 


4472 


6258 


784CIP2B_585 


7326 


901 


2687 


4473 


6259


784CIP2B 586 


7334 


902 


2688 


4474 


6260 


784CIP2B 587 


7337 


903 


2689 


4475 


6261 


784CIP2B_588


7339 


904 


2690 


4476 


6262 


784CIP2B_589


7344 


905 


2691 


4477 


6263


784CIP2B_590 


7355 


906


2692 


4478 


6264


784CIP2B_591


7363 


907 


2693 


4479 


6265 


784CIP2B 592 


7363 


908 


2694


4480 


6266 


784CIP2B_593


7365 


909 


2695 


4481 


6267 


784CIP2B_594 


7368 


910 


2696 


4482 


6268 


784CIP2B_595 


7369 


911 


2697 


4483 


6269 


784CIP2B_596 


7372 


912 


2698 


4484 


6270 


784CIP2B_599 


7375 


913 


2699 


4485 


6271 


784CIP2B_600 


7381 


914 


2700 


4486 


6272 


784CIP2B_601 


7383 


915


2701 


4487 


6273 


784CIP2B_602 


7387 


916 


2702 


4488 


6274 


784CIP2B_603 


7391 


917 


2703 


4489 


6275 


784CIP2B_604 


7393


918 


2704 


4490 


6276 


784CIP2B 605 


7395 


919 


2705 


4491 


6277 


784CIP2B_606


7397 


920 


2706 


4492 


6278 


784CIP2B_607


7399 


921 


2707 


4493 


6279 


784CIP2B 608 


7405


922 


2708 


4494 


6280 


784CIP2B_609


740£ 


923 


2709 


4495 


6281 


784CIP2B 610 


7406 


924 


2710 


4496 


6282 


784CIP2B_611 


7409 


925 


2711 


4497 


6283 


784CIP2B 612 


7410 


926 


2712 


4498 


6284 


784CIP2B 613 


7411


927 


2713 


4499 


6285 


784CIP2B_614


7417 


928


2714 


4500 


6286 


784CIP2B_615


7418


929


2715


4501 


6287 


784CIP2B_616


7421 


930


2716 


4502 


6288 


784CIP2B 617 


7422 


931


2717 


4503 


6289 


784CIP2B_618


7422


932


2718


4504 


6290 


784CIP2B_619 


7423


933


2719


4505


6291 


784CIP2B_620


7424 


934 


2720 


4506 


6292 


784CIP2B_621


7426 


935 


2721 


4507 


6293 


784CIP2B_622


7427 


936 


2722 


4508


6294 


784CIP2B_623


7428 


937 


2723 


4509 


6295 


784CIP2B_624


7430 


938 


2724 


4510 


6296 


784CIP2B_625


7435 


939 


2725 


4511 


6297 


784CIP2B 626 


7437 


940 


2726 


4512 


6298 


784CIP2B 627 


7439 


941 


2727 


4513 


6299 


784CIP2B 628 


7440


942 


2728 


4514 


6300 


784CIP2B_629


7442 


943 


2729 


4515 


6301 


784CIP2B 630 


7450 


944 


2730 


4516


6302 


784CIP2B_631


7451 


945 


2731 


4517 


6303 


784CIP2B 632 


7452 


946 


2732 


4518 


6304 


784CIP2B_633


7454 


947


2733 


4519 


6305 


784CIP2B 634 


7457 


948 


2734 


4520 


6306 


784CIP2B 635 


7459 


949 


2735 


4521 


6307 


784CIP2B_636 


7461 


950 


2736 


4522 


6308 


784CIP2B 637 


7463 


951 


2737 


4523 


6309 


784CIP2B_638


7466 


952


2738 


4524


6310 


784CIP2B 639 


7469 


953 


2739 


4525 


6311 


784CIP2B_640


7473 


954 


2740 


4526 


6312 


784CIP2B_641 


7481 


955 


2741 


4527


6313 


784CIP2B 642 


7482 


956 


2742 


4528 


6314 


784CIP2B_643 


7482 


957 


2743 


4529 


6315 


784CIP2B_644 


7483 


958 


2744 


4530 


6316 


784CIP2B_645 


7485 


959 


2745 


4531 


6317 


784CIP2B 646 


7486 


960 


2746 


4532 


6318 


784CIP2B_647 


7487 


961 


2747 


4533 


6319 


784CIP2B_648


7491 


962 


2748 


4534 


6320 


784CIP2B_649 


7492 


963 


2749 


4535 


6321


784CIP2B 650 


7494 


964 


2750 


4536 


6322 


784CIP2B_651


7498 


965 


2751 


4537 


6323 


784CIP2B_652


7504 


966 


2752 


4538 


6324 


784CIP2B_653


7508 


967 


2753 


4539


6325 


784CIP2B_654 


7516 


968


2754 


4540


6326 


784CIP2B 655 


7518 


969 


2755 


4541 


6327 


784CIP2B_656 


7519 


970 


2756 


4542 


6328 


784CIP2B 657 


7521 


971


2757 


4543 


6329 


784CIP2B_658


7529


972


2758


4544 


6330 


784CIP2B 659 


7532 


973


2759 


4545 


6331 


784CIP2B_660


7533 


974


2760 


4546 


6332 


784CIP2B 661 


7535


975


2761


4547 


6333 


784CIP2B_662 


7545 


976


2762 


4548 


6334 


784CIP2B 663 


7546 


977


2763


4549 


6335 


784CIP2B 664 


7552 


978


2764 


4550 


6336 


784CIP2B_665 


7554


979


2765


4551


6337 


784CIP2B 666 


7567 


980


2766 


4552 


6338 


784CIP2B_667


7569 


981 


2767 


4553


6339


784CIP2B_668


7575


982 


2768 


4554 


6340 


784CIP2B_669


7576 


983 


2769 


4555


6341 


784CIP2B_670


7577 


984 


2770 


4556 


6342 


784CIP2B 671 


7579 


985 


2771 


4557 


6343 


784CIP2B_672


7582


986 


2772 


4558 


6344 


784CIP2B 673 


7587


987 


2773 


4559 


6345 


784CIP2B_674


7589 


988 


2774 


4560 


6346


784CIP2B 675 


7597 


989 


2775 


4561 


6347


784CIP2B_676


7597 




2776 


4562 


6348 


784CIP2B_677 


7609 


991


2777 


4563 


6349 


784CIP2B 678 


7609


992


2778


4564 


6350


784CIP2B_679


7609 


993


2779


4565 


6351 


784CIP2B 680 


7613


994


2780


4566 


6352 


784CIP2B_681


7623 


995


2781 


4567 


6353 


784CIP2B_682


7629 


996


2782 


4568 


6354 


784CIP2B 683 


7630 


997 


2783 


4569 


6355 


784CIP2B_684


7633 


998 


2784 


4570 


6356 


784CIP2B_685 


7635 


999 


2785 


4571 


6357 


784CIP2B 686 


7638 


1000 


2786 


4572 


6358 


784CIP2B 687 


7639 


1001 


2787 


4573 


6359 


784CIP2B 688 


7646 


1002 


2788 


4574 


6360 


784CIP2B_689


7647 


1003 


2789


4575 


6361 


784CIP2B 690 


7648 


1004 


2790 


4576 


6362 


784CIP2B 691 


7658


1005 


2791 


4577 


6363 


784CIP2B 692 


7664 


1006 


2792 


4578 


6364 


784CIP2B_693 


7664 


1007 


2793 


4579 


6365 


784CIP2B 695 


7674


1008 


2794 


4580 


6366 


784CIP2B 696 


7675 


1009 


2795 


4581 


6367 


784CIP2B 697 


7676 


1010 


2796 


4582 


6368


784CIP2B_698


7681 


1011 


2797 


4583 


6369 


784CIP2B_699 


7688


1012 


2798 


4584 


6370 


784CIP2B 700 


7693 


1013 


2799 


4585 


6371 


784CIP2B_701 


7694 


1014 


2800 


4586


6372 


784CIP2B 702 


7715 


1015 


2801 


4587 


6373 


784CIP2B 703 


7716 


1016 


2802 


4588 


6374 


784CIP2B_704


7718 


1017 


2803 


4589


6375 


784CIP2B_705 


7721


1018 


2804 


4590 


6376 


784CIP2B 706 


7723 


1019


2805 


4591 


6377 


784CIP2B 707 


7729 


1020 


2806 


4592 


6378 


784CIP2B 708 


7733 


1021 


2807 


4593 


6379 


784CIP2B_709 


7735 


1022 


2808 


4594 


6380 


784CIP2B_710 


7741 


1023 


2809 


4595 


6381 


784CIP2B 711 


7743 


1024 


2810 


4596 


6382 


784CIP2B 712 


7745


1025 


2811 


4597 


6383 


784CIP2B 713 


7749 


1026 


2812 


4598 


6384


784CIP2B 714 


7750 


1027 


2813 


4599 


6385


784CIP2B 715 


7757 


1028 


2814 


4600 


6386 


784CIP2B_716 


7759 


1029 


2815 


4601


6387 


784CIP2B_717 


7760 


1030 


2816 


4602 


6388 


784CIP2B 718 


7760 


1031 


2817 


4603 


6389 


784CIP2B 719 


7764


1032


2818


4604 


6390 


784CIP2B_720 


7765


1033


2819 


4605 


6391 


784CIP2B_721 


7766 


1034


2820 


4606 


6392 


784CIP2B_722


7767


1035


2821


4607 


6393 


784CIP2B 723 


7769


1036


2822


4608


6394 


784CIP2B 724 


7770 


1037 


2823


4609 


6395 


784CIP2B_725


7774 


1038


2824


4610


6396 


784CIP2B_726


7779


1039


2825


4611 


6397 


784CIP2B_727


7781 


1040


2826


4612


6398 


784CIP2B_728


7782


1041


2827


4613 


6399 


784CIP2B 729 


7783 


1042


2828 


4614 


6400 


784CIP2B_730 


7787 


1043 


2829 


4615 


6401 


784CIP2B 731 


7792 


1044 


2830 


4616


6402 


784CIP2B_732 


7795 


1045 


2831 


4617 


6403


784CIP2B 733 


7801 


1046 


2832 


4618 


6404 


784CIP2B 734 


7807 


1047 


2833


4619


6405


784CIP2B_735


7808 


1048 


2834 


4620


6406 


784CIP2B_736


7819 


1049 


2835 


4621 


6407 


784CIP2B_737 


7824 


1050 


2836 


4622 


6408 


784CIP2B 738 


7826 


1051 


2837


4623


6409 


784CIP2B_739


7829 


1052 


2838


4624


6410


784CIP2B_740


7832 


1053 


2839 


4625 


6411


784CIP2B 741 


7839 


1054 


2840 


4626 


6412 


784CIP2B_743


7847


1055


2841 


4627 


6413


784CIP2B 744 


7848 


1056 


2842 


4628 


6414


784CIP2B 745 


7853


1057


2843 


4629


6415


784CIP2B_746


7854 


1058 


2844 


4630


6416


784CIP2B 747 


7856 


1059 


2845


4631


6417


784CIP2B 748 


7862 


1060 


2846


4632


6418


784CIP2B 749 


7865 


1061 


2847


4633


6419 


784CIP2B 750 


7874 


1062 


2848 


4634


6420 


784CIP2B_751


7877 


1063


2849


4635


6421 


784CIP2B_752


7880 


1064


2850


4636


6422 


784CIP2B_753


7882 


1065 


2851


4637 


6423 


784CIP2B_754


7884 


1066 


2852


4638


6424 


784CIP2B 755 


7886 


1067


2853


4639


6425 


784CIP2B 756 


7888 


1068 


2854


4640 


6426 


784CIP2B 757 


7889 


1069


2855


4641 


6427 


784CIP2B 758 


7901


1070


2856


4642


6428 


784CIP2B 759 


7910 


1071 


2857 


4643 


6429 


784CIP2B_760


7911 


1072 


2858 


4644 


6430


784CIP2B_761


7921 


1073 


2859 


4645 


6431 


784CIP2B 762 


7923 


1074 


2860 


4646 


6432 


784CIP2B 763 


7924


1075 


2861 


4647 


6433 


784CIP2B 764 


7925 


1076 


2862 


4648 


6434 


784CIP2B_765


7928 


1077 


2863 


4649 


6435 


784CIP2B 766 


7929 


1078 


2864 


4650 


6436 


784CIP2B 767 


7930 


1079 


2865


4651 


6437 


784CIP2B 768 


7934 


1080 


2866


4652 


6438


784CIP2B_769 


7938 


1081 


2867


4653 


6439 


784CIP2B_770 


7942 


1082


2868 


4654 


6440 


784CIP2B 771 


7945 


1083


2869 


4655 


6441 


784CIP2B 772 


7946 


1084


2870 


4656 


6442 


784CIP2B 773 


7948


1085


2871 


4657 


6443 


784CIP2B_774 


7951 


1086 


2872


4658 


6444 


784CIP2B_775


7952 


1087 


2873


4659 


6445 


784CIP2B_776


7953 


1088 


2874


4660 


6446 


784CIP2B_777


7954 


1089


2875


4661


6447 


784CIP2B 778 


7957 


1090 


2876


4662 


6448 


784CIP2B 779 


7958 


1091 


2877 


4663


6449


784CIP2B_780


7961 


1092 


2878 


4664


6450 


784CIP2B_781


796^ 


1093 


2879


4665


6451


784CIP2B_782


7966 


1094 


2880 


4666


6452 


784CIP2B_783 


7979 


1095 


2881 


4667


6453


784CIP2B_784


7986 


1096


2882 


4668 


6454 


784CIP2B 785 


7986 


1097 


2883 


4669 


6455 


784CIP2B_786


7988 


1098 


2884


4670 


6456 


784CIP2B 787 


7991 


1099 


2885 


4671 


6457


784CIP2B 788 


7992 


1100


2886


4672 


6458


784CIP2B_789


7992 


1101 


2887 


4673


6459 


784CIP2B 790 


7992 


1102 


2888 


4674 


6460 


784CIP2B 791 


7992 


1103 


2889


4675


6461


784CIP2B 792 


8003 


1104


2890


4676


6462


784CIP2B 793 


8014 


1105 


2891


4677 


6463 


784CIP2B 794 


8015 


1106 


2892 


4678 


6464 


784CIP2B 795 


8016 


1107 


2893 


4679


6465 


784CIP2B 796 


8017 


1108 


2894


4680 


6466 


784CIP2B 797 


8019 


1109 


2895 


4681 


6467 


784CIP2B 798 


8020 


1110 


2896 


4682 


6468


784CIP2B 799 


8022 


1111 


2897 


4683 


6469 


784CIP2B 800 


8022 


1112 


2898 


4684 


6470 


784CIP2B_801


8028 


1113


2899 


4685


6471 


784CIP2B_802


8030


1114 


2900 


4686


6472 


784CIP2B 803 


8038


1115


2901 


4687


6473 


784CIP2B_804 


8042 


1116


2902 


4688


6474 


784CIP2B_805


8045 


1117 


2903 


4689 


6475 


784CIP2B 806 


8045 


1118


2904 


4690 


6476 


784CIP2B 807 


8046 


1119


2905 


4691 


6477 


784CIP2B 808 


8047 


1120


2906 


4692 


6478 


784CIP2B 809 


8051 


1121 


2907 


4693 


6479 


784CIP2B 810 


8059


1122


2908


4694


6480 


784CIP2B 811 


8064 


1123 


2909 


4695 


6481 


784CIP2B 812 


8069 


1124 


2910 


4696 


6482


784CIP2B_813 


8074 


1125 


2911 


4697 


6483


784CIP2B 814 


8077 


1126 


2912 


4698 


6484 


784CIP2B_815


8078


1127 


2913 


4699 


6485 


784CIP2B_816


8079 


1128 


2914 


4700 


6486


784CIP2B 817 


8084


1129 


2915 


4701 


6487 


784CIP2B_818


8088


1130 


2916 


4702 


6488 


784CIP2B_819


8090


1131 


2917 


4703 


6489 


784CIP2B_820


8091 


1132 


2918 


4704 


6490 


784CIP2B 821 


8099 


1133 


2919 


4705 


6491 


784CIP2B 822 


8099 


1134 


2920 


4706 


6492


784CIP2B 823 


8100 


1135 


2921 


4707 


6493 


784CIP2B_824


8102 


1136 


2922 


4708 


6494 


784CIP2B 825 


8103 


1137


2923 


4709 


6495 


784CIP2B 826 


8103 


1138 


2924 


4710


6496


784CIP2B_827


8104


1139 


2925


4711 


6497


784CIP2B_828 


8108 


1140 


2926 


4712 


6498


784CIP2B_829


8110 


1141 


2927 


4713 


6499 


784CIP2B_830


8116 


1142 


2928 


4714 


6500


784CIP2B_831


8117 


1143 


2929 


4715 


6501


784CIP2B_832


8123 


1144 


2930 


4716 


6502


784CIP2B 833 


8130


1145 


2931 


4717 


6503 


784CIP2B 834 


8130


1146 


2932 


4718 


6504 


784CIP2B_835 


8143 


1147 


2933 


4719 


6505 


784CIP2B 836 


8143 


1148 


2934 


4720 


6506 


784CIP2B_837 


8154 


1149 


2935 


4721 


6507 


784CIP2B 838 


8155 


1150 


2936 


4722 


6508


784CIP2B_839


8162 


1151 


2937


4723 


6509 


784CIP2B 840 


8163 


1152 


2938 


4724


6510 


784CIP2B 841 


8172 


1153 


2939 


4725 


6511 


784CIP2B_842


8173 


1154 


2940 


4726 


6512 


784CIP2B_843


8179


1155


2941


4727 


6513 


784CIP2B 844 


8182 


1156


2942 


4728 


6514 


784CIP2B 845 


8183


1157


2943 


4729 


6515 


784CIP2B 846 


8184 


1158 


2944 


4730 


6516 


784CIP2B 847 


8185 


1159


2945 


4731 


6517 


784CIP2B 848 


8187 


1160


2946 


4732 


6518 


784CIP2B 849 


8188 


1161 


2947 


4733 


6519 


784CIP2B_850


8190 


1162 


2948


4734 


6520 


784CIP2B_851


8190 


1163 


2949 


4735 


6521


784CIP2B 852 


8192


1164


2950


4736


6522 


784CIP2B 853 


8193 


1165 


2951 


4737 


6523 


784CIP2B_854


8197 


1166


2952 


4738 


6524 


784CIP2B 855 


8197 


1167 


2953 


4739 


6525 


784CIP2B_856 


8199 


1168


2954 


4740 


6526


784CIP2B 857 


8202 


1169 


2955 


4741 


6527 


784CIP2B 858 


8203 


1170 


2956 


4742 


6528 


784CIP2B 859 


8208 


1171 


2957 


4743 


6529


784CIP2B 860 


8209 


1172 


2958 


4744 


6530 


784CIP2B 861 


8211 


1173


2959


4745 


6531 


784CIP2B 862 


8214 


1174 


2960 


4746 


6532 


784CIP2B 863 


8217 


1175


2961 


4747 


6533 


784CIP2B 864 


8223 


1176 


2962


4748 


6534 


784CIP2B 865 


8224 


1177 


2963 


4749 


6535 


784CIP2B_866


8226


1178


2964 


4750 


6536 


784CIP2B_867 


8227 


1179 


2965 


4751 


6537 


784CIP2B_868 


8229 


1180


2966 


4752 


6538 


784CIP2B 869 


8232 


1181


2967 


4753 


6539 


784CIP2B 870 


8236 


1182


2968 


4754 


6540 


784CIP2B 871 


8239 


1183


2969 


4755 


6541 


784CIP2B_872 


8244 


1184 


2970 


4756 


6542


784CIP2B_873 


8245 


1185


2971 


4757 


6543 


784CIP2B_874 


8248 


1186


2972


4758 


6544 


784CIP2B 875 


8251 


1187 


2973 


4759 


6545 


784CIP2B_876


8253 


1188 


2974 


4760 


6546 


784CIP2B 877 


8260 


1189 


2975


4761 


6547 


784CIP2B_878


8262 


1190 


2976


4762 


6548 


784CIP2B 879 


8268 


1191 


2977 


4763 


6549 


784CIP2B 880 


8270 


1192 


2978 


4764 


6550 


784CIP2B_881


8272 


1193


2979 


4765


6551 


784CIP2B 882 


8274 


1194 


2980 


4766 


6552


784CIP2B_883


8274


1195 


2981 


4767


6553 


784CIP2B 884 


8275 


1196 


2982 


4768 


6554


784CIP2B_885


8277 


1197 


2983 


4769 


6555 


784CIP2B_886


8281 


1198 


2984 


4770 


6556 


784CIP2B 887 


8283 


1199 


2985 


4771 


6557 


784CIP2B 888 


8289 


1200 


2986 


4772 


6558 


784CIP2B 889 


8295 


1201 


2987 


4773 


6559 


784CIP2B_890


8300 


1202 


2988 


4774 


6560 


784CIP2B 891 


8303 


1203 


2989 


4775 


6561


784CIP2B 892 


8304 


1204 


2990 


4776 


6562 


784CIP2B 893 


8305


1205 


2991 


4777 


6563 


784CIP2B 894 


8309 


1206 


2992 


4778 


6564 


784CIP2B 895 


8318 


1207 


2993 


4779 


6565 


784CIP2B_896 


8319 


1208


2994 


4780 


6566 


784CIP2B 897 


8321 


1209 


2995


4781 


6567 


784CIP2B_898 


8322 


1210 


2996 


4782 


6568 


784CIP2B_899


8323 


1211 


2997 


4783 


6569


784CIP2B 900 


8325 


1212 


2998 


4784 


6570 


784CIP2B_901


8331 


1213 


2999


4785 


6571 


784CIP2B_902 


8332 


1214 


3000 


4786 


6572 


784CIP2B 903 


8333 


1215 


3001 


4787 


6573 


784CIP2B 904 


8335 


1216 


3002 


4788 


6574 


784CIP2B 905 


8336 


1217 


3003 


4789 


6575 


784CIP2B_906


8337 


1218 


3004 


4790 


6576 


784CIP2B_907 


8340 


1219 


3005 


4791 


6577 


784CIP2B_908


8343 


1220 


3006 


4792 


6578 


784CIP2B 909 


8347 


1221 


3007 


4793 


6579 


784CIP2B 910 


8349 


1222 


3008


4794 


6580


784CIP2B_911 


8351


1223 


3009 


4795 


6581 


784CIP2B_912 


8353 


1224 


3010 


4796 


6582


784CIP2B_913 


8355


1225 


3011 


4797 


6583 


784CIP2B_914 


8361 


1226 


3012 


4798 


6584 


784CIP2B_915 


8365


1227 


3013 


4799 


6585 


784CIP2B_916 


8367 


1228 


3014 


4800 


6586 


784CIP2B_917 


8369 


1229 


3015 


4801 


6587 


784CIP2B_919


8375 


1230 


3016 


4802 


6588 


784CIP2B_920 


8387 


1231 


3017 


4803


6589 


784CIP2B_921


8391 


1232 


3018 


4804 


6590 


784CIP2B 922 


8393 


1233


3019 


4805 


6591 


784CIP2B_923 


8393 


1234 


3020 


4806 


6592 


784CIP2B_924 


8394


1235 


3021 


4807 


6593 


784CIP2B_925 


8395 


1236


3022 


4808 


6594 


784CIP2B_926


8396 


1237 


3023


4809 


6595 


784CIP2B 927 


8398 


1238


3024 


4810 


6596 


784CIP2B_928 


8402 


1239


3025 


4811


6597 


784CIP2B 929 


8402 


1240 


3026 


4812


6598 


784CIP2B_930


8405 


1241 


3027


4813 


6599


784CIP2B 931 


8405


1242 


3028 


4814 


6600 


784CIP2B_932 


8409 


1243 


3029 


4815


6601 


784CIP2B_933


8410 


1244 


3030 


4816 


6602 


784CIP2B 934 


8414


1245 


3031 


4817 


6603 


784CIP2B 935 


8415 


1246 


3032 


4818 


6604 


784CIP2B 936 


8419 


1247 


3033 


4819 


6605 


784CIP2B 937 


8426 


1248 


3034 


4820 


6606 


784CIP2B 938 


8430 


1249 


3035 


4821 


6607 


784CIP2B_939


8431 


1250 


3036 


4822


6608 


784CIP2B 940 


8432 


1251 


3037 


4823


6609 


784CIP2B 941 


8433 


1252 


3038 


4824


6610 


784CIP2B 942 


8434 


1253 


3039 


4825


6611 


784CIP2B 943 


8438 


1254 


3040 


4826 


6612 


784CIP2B 944 


8439 


1255


3041 


4827 


6613 


784CIP2B_945 


8441 


1256 


3042 


4828 


6614 


784CIP2B 946 


8450 


1257 


3043 


4829 


6615 


784CIP2B_947


8451 


1258 


3044 


4830 


6616 


784CIP2B_948


8452 


1259 


3045 


4831 


6617 


784CIP2B 949 


8460


1260 


3046 


4832 


6618


784CIP2B 950 


8461 


1261


3047 


4833


6619 


784CIP2B 951 


8462 


1262 


3048 


4834 


6620 


784CIP2B_952


8464


1263 


3049 


4835


6621 


784CIP2B 953 


8465 


1264 


3050 


4836 


6622 


784CIP2B_954


8467 


1265 


3051 


4837


6623


784CIP2B 955 


8470 


1266 


3052


4838 


6624 


784CIP2B_956 


8471 


1267 


3053 


4839 


6625 


784CIP2B 957 


8473 


1268 


3054 


4840 


6626 


784CIP2B_958 


8474 


1269 


3055 


4841 


6627 


784CIP2B 959 


8475 


1270 


3056 


4842


6628 


784CIP2B_960 


8476 


1271 


3057 


4843


6629 


784CIP2B_961 


8480 


1272 


3058 


4844


6630 


784CIP2B 962 


8482 


1273 


3059 


4845 


6631


784CIP2B_963 


8482 


1274 


3060 


4846 


6632 


784CIP2B_964 


8486 


1275 


3061 


4847 


6633 


784CIP2B_965 


8488


1276 


3062 


4848 


6634 


784CIP2B 966 


8492 


1277 


3063 


4849 


6635 


784CIP2B_967 


8494 


1278 


3064 


4850 


6636 


784CIP2B 968 


8496


1279 


3065 


4851 


6637


784CIP2B 969 


8497


1280


3066 


4852 


6638


784CIP2B 970 


8499 


1281 


3067 


4853 


6639


784CIP2B_971 


8513 


1282


3068 


4854 


6640 


784CIP2B 972 


8522


1283 


3069 


4855 


6641 


784CIP2B 973 


8526


1284


3070


4856 


6642 


784CIP2B_974


8531 


1285 


3071 


4857 


6643 


784CIP2B 975 


8533


1286


3072


4858 


6644 


784CIP2B_976


8542 


1287


3073


4859 


6645 


784CIP2B 977 


8544 


1288 


3074 


4860 


6646 


784CIP2B 978 


8555 


1289 


3075 


4861


6647 


784CIP2B 979 


8565 


1290 


3076 


4862


6648 


784CIP2B 980 


8572 


1291 


3077


4863


6649


784CIP2B_981


8576 


1292 


3078 


4864 


6650 


784CIP2B_982 


8578 


1293 


3079 


4865 


6651 


784CIP2B 983 


8584 


1294 


3080 


4866 


6652 


784CIP2B 984 


8598


1295 


3081 


4867 


6653 


784CIP2B_985 


8602 


1296 


3082 


4868 


6654 


784CIP2B 986 


8604 


1297 


3083 


4869 


6655 


784CIP2B_987


8609 


1298 


3084 


4870 


6656 


784CIP2B_988


8612 


1299


3085 


4871


6657


784CIP2B 989 


8637 






WO 01/53312 


"xTffij 


3086 


4872 


6658 


784CIP2B_990


8640 


1301 


3087 


4873 


6659 


784CIP2B 991 


8643


1302 


3088


4874 


6660 


784CIP2B 992 


864* 


1303


3089 


4875 


6661 


784CIP2B 993 


8650 


1304


3090 


4876 


6662 


784CIP2B 994 


8651 


1305


3091


4877


6663 


784CIP2B_995 


8654


1306


3092


4878


6664 


784CIP2B 996 


8655


1307


3093


4879 


6665 


784CIP2B 997 


8657


1308 


3094


4880 


6666 


784CIP2B 998 


8665 


1309


3095 


4881 


6667 


784CIP2B 999 


8668 


1310


3096 


4882 


6668 


784CIP2B 1000 


8671 


1311


3097 


4883 


6669 


784CIP2B 1001 


8672


1312


3098


4884 


6670 


784CIP2B 1002 


8692 


1313 


3099 


4885 


6671 


784CIP2B_1003


8706 


1314 


3100 


4886 


6672 


784CIP2B_1004


8716 


1315 


3101 


4887


6673 


784CIP2B 1005 


8719 


1316


3102 


4888 


6674 


784CIP2B_1006


8743 


1317 


3103 


4889 


6675 


784CIP2B 1007 


8764 


1318


3104 


4890 


6676 


784CIP2B_1008


8764 


1319 


3105 


4891 


6677


784CIP2B 1009 


8764 


1320 


3106 


4892 


6678


784CIP2B 1010 


8774 


1321 


3107 


4893 


6679 


784CIP2B 1011 


8782 


1322 


3108 


4894 


6680 


784CIP2B 1012 


8796 


1323 


3109 


4895 


6681 


784CIP2B_1013 


8827 


1324 


3110 


4896 


6682


784CIP2B 1014 


8842 


1325 


3111 


4897 


6683


784CIP2B_1015


8842 


1326 


3112 


4898 


6684 


784CIP2B 1016 


8858 


1327 


3113 


4899 


6685 


784CIP2B 1017 


8871 


1328 


3114 


4900 


6686 


784CIP2B_1018


8921


1329 


3115 


4901 


6687 


784CIP2B_1019


8927 


1330


3116 


4902 


6688 


784CIP2B 1020 


8942 


1331 


3117 


4903 


6689


784CIP2B 1021 


8994 


1332 


3118


4904 


6690 


784CIP2B 1022 


9023 


1333 


3119 


4905 


6691 


784CIP2B 1023 


9028 


1334 


3120 


4906 


6692 


784CIP2B 1024 


9058 


1335 


3121 


4907 


6693 


784CIP2B 1025 


9058 


1336 


3122 


4908 


6694 


784CIP2B 1026 


9079 


1337 


3123 


4909 


6695 


784CIP2B 1027 


9079 


1338 


3124 


4910 


6696 


784CIP2B 1028 


9082


1339


3125


4911 


6697 


784CIP2B 1029 


9084


1340


3126 


4912 


6698 


784CIP2B 1030 


9093 


1341


3127 


4913 


6699 


784CIP2B 1031 


9101 


1342 


3128


4914 


6700 


784CIP2B 1032 


9103 


1343 


3129 


4915 


6701 


784CIP2B_1033 


9105 


1344 


3130 


4916


6702 


784CIP2B 1034 


9151 


1345 


3131


4917 


6703 


784CIP2B 1035 


9161 


1346


3132


4918


6704 


784CIP2B_1036


9172 


1347


3133


4919


6705 


784CIP2B 1037 


9174 


1348


3134


4920


6706 


784CIP2B 1038 


9204 


1349


3135


4921


6707 


784CIP2B 1039 


9234 


1350 


3136


4922 


6708 


784CIP2B 1040 


9235 


1351 


3137


4923 


6709 


784CIP2B_1041 


9239


1352


3138


4924


6710 


784CIP2B_1042 


9256 


1353


3139


4925 


6711


784CIP2B_1043


9276


1354 


3140 


4926 


6712 


784CIP2B 1044 


9345 


1355 


3141 


4927 


6713 


784CIP2B 1045 


9379 


1356 


3142 


4928


6714 


784CIP2B_1046


9435


1357 


3143


4929 


6715 


784CIP2B_1047


9437 


1358 


3144 


4930 


6716 


784CIP2B 1048 


9469 


1359 


3145


4931 


6717 


784CIP2B_1049


9500 


1360 


3146 


4932 


6718 


784CIP2B 1050 


9502 


1361 


3147 


4933


6719 


784CIP2B_1051


9520 


1362 


3148




6720 


784CIP2B_1052 


9541 


1363 


3149


4935 


6721 


784CIP2B_1053


9541 


1364 


3150


4936 


6722 


784CIP2B_1054 


9548 


1365 


3151


4937 


6723 


784CIP2B_1055 


9556





3152


4938 


6724 


784CIP2B 1056


9556 


1367 




4939 


6725 


784CIP2B 1057 


9575 


1368 


3154


4940 


6726 


784CIP2B 1058 


9589 


1369


3155


4941 


6727 


784CIP2B 1059 


9599 


1370


3156


4942 


6728 


784CIP2B 1060 


9602 


1371


3157


4943 


6729 


784CIP2B 1061 


9606 


1372


3158


4944 


6730 


784CIP2B 1062 


9622 




3159 


4945 


6731 


784CIP2B 1063


9623 


1374


3160 


4946 


6732 


784CIP2B 1064 


9646 


1375 


3161 


4947 


6733 


784CIP2B 1065


9747 


1376 


3162


4948 


6734 


784CIP2B 1066 


9773 


1377 


3163 


4949 


6735 


784CIP2B 1067


9785


1378 


3164 


4950 


6736


784CIP2B 1068


9801 


1379 


3165 


4951 


6737 


784CIP2B_1069 


9811 


1380


3166 


4952 


6738


784CIP2B 1070 


9843 


1381


3167 


4953


6739 


784CIP2B 1071 


9854 


1382 


3168 


4954 


6740 


784CIP2B 1072 


9854 


1383


3169 


4955 


6741 


784CIP2B 1073 


9864


1384 


3170 


4956 


6742 


784CIP2B 1074


9864 


1385 


3171 


4957 


6743 


784CIP2B 1075 


9871 


1386 


3172 


4958 


6744 


784CIP2B 1076 


9879 


1387 


3173 


4959


6745 


784CIP2B 1077


9881 


1388


3174 


4960 


6746 


784CIP2B 1078 


9885 


1389 


3175 


4961 


6747 


784CIP2B 1079 


9901 


1390 


3176 


4962 


6748 


784CIP2B 1080 


9912 


1391 


3177 


4963


6749 


784CIP2B 1081 


9916 


1392 


3178


4964 


6750 


784CIP2B 1082 


9921 


1393 


3179 


4965 


6751 


784CIP2B 1083 


9925 


1394 


3180 


4966 


6752 


784CIP2B 1084


9930 


1395 


3181 


4967 


6753 


784CIP2B 1085


9949 


1396 


3182 


4968 


6754 


784CIP2B 1086


9951 


1397


3183 


4969 


6755 


784CIP2B 1087 


9959 


1398 


3184 


4970 


6756 


784CIP2B 1088


9973 




3185 


4971 


6757 


784CIP2B 1089


9982 


1400 


3186 


4972 


6758 


784CIP2B 1090 


9994 


1401


3187 


4973 


6759 


784CIP2B 1091 


10021 


1402 


3188 


4974 


6760 


784CIP2B 1092 


10041 


1403 




4975 


6761 


784CIP2B 1094 


10067 


1404 


3190


4976 


6762


784CIP2B 1095 


10073 


1405 


3191


4977 


6763 


784CIP2B 1096 


10112


1406 


3192


4978 


6764 


784CIP2B 1097


10117 


1407 


3193


4979 


6765 


784CIP2B 1098 


10132 


1408 




4980 


6766 


784CIP2B 1099


10169 


1409 


3195 


4981


6767


784CIP2B 1100 


10217 


1410 


3196 


4982 


6768 


784CIP2B 1101


10226 


1411 


3197 


4983 


6769 


784CIP2B 1102


10232 


1412 


3198 


4984 


6770 


784CIP2B 1103 


10237 


1413 




4985 


6771 


784CIP2B 1104


10279 


1414 


3200 


4986 


6772 


784CIP2C 1


33 


1415 


3201 


4987 


6773 


784CIP2C 2


271 


1416 


3202 


4988 


6774 


784CIP2C 3


848 


1417 


3203 


4989 


6775 


784CIP2C 4 


849 


1418 


3204 


4990 


6776 


784CIP2C 5 


664 


1419 


3205 


4991 


6777


784CIP2C 6 


953 


1420 


3206 


4992


6778


784CIP2C_7 


980 


1421 


3207


4993 


6779 


784CIP2C 8 


1595 


1422 


3208


4994 


6780 


784CIP2C 9 


1697 


1423 


3209 


4995 


6781 


784CIP2C_10


1744




3210


4996 


6782 


784CIP2C 11 


1937 




3211 


4997 


6783 


784CIP2C 12 


1955 


1426


3212 


4998 


6784 


784CIP2C 13 


1955


1427


3213 


4999 


6785 


784CIP2C 14 


2185


1428


3214 


5000 


6786 


784CIP2C_15 


2889 





3215 


5001 


6787


784CIP2C 16 


2901 


1430


3216 


5002 


6788


784CIP2C 17 


2902


1431


3217 


5003 


6789


784CIP2C 18 


2905 


1432 


3218 


5004 


6790 


784CIP2C 19 


2948 


1433 


3219 


5005 


6791 


784CIP2C 20 


2956 


1434 


3220 


5006 


6792 


784CIP2C 21 


2959 


1435 


3221 


5007 


6793 


784CIP2C_22


2965 


1436 


3222 


5008 


6794 


784CIP2C 23 


2966 


1437 


3223 


5009 


6795 


784CIP2C_24


2970 


1438 


3224 


5010 


6796 


784CIP2C 25 


2985 


1439 


3225 


5011 


6797 


784CIP2C_26 


2987 


1440 


3226 


5012 


6798 


784CIP2C 27 


2993 


1441 


3227 


5013 


6799 


784CIP2C 28


2993 


1442 


3228 


5014 


6800 


784CIP2C 29 


3017 


1443 


3229 


5015 


6801 


784CIP2C 30 


3046 


1444 


3230 


5016 


6802 


784CIP2C 31 


3050 


1445 


3231 


5017 


6803


784CIP2C 32 


3357 


1446 


3232 


5018


6804 


784CIP2C 33 


3359 


1447 


3233 


5019 


6805 


784CIP2C_34


3432 


1448 


3234 


5020 


6806 


784CIP2C 35 


3438 


1449 


3235


5021 


6807 


784CIP2C_36 


3439 


1450 


3236 


5022 


6808 


784CIP2C 39 


3463 


1451 


3237 


5023 


6809 


784CIP2C 40 


3466 


1452 


3238 


5024 


6810


784CIP2C_41 


3466 


1453 


3239 


5025 


6811


784CIP2C_42 


3467 


1454 


3240 


5026 


6812


784CIP2C_43 


3468 


1455 


3241 


5027 


6813 


784CIP2C_44 


3483 


1456 


3242 


5028 


6814 


784CIP2C_45 


3484 


1457 


3243 


5029 


6815 


784CIP2C 46 


3488 


1458 


3244 


5030 


6816 


784CIP2C 47 


3491 


1459 


3245 


5031 


6817 


784CIP2C 48 


3493 


1460 


3246 


5032 


6818 


784CIP2C_49 


3494 


1461 


3247 


5033 


6819 


784CIP2C 50 


3495 


1462 


3248 


5034 


6820


784CIP2C 51 


3496 


1463 


3249 


5035 


6821 


784CIP2C 52 


3503 


1464


3250 


5036 


6822 


784CIP2C 53 


3503 


1465 


3251 


5037 


6823 


784CIP2C 54 


3504 


1466


3252 


5038 


6824 


784CIP2C 55


3511 


1467


3253 


5039 


6825 


784CIP2C_56


3531 


1468


3254 


5040 


6826 


784CIP2C_57 


3536 


1469


3255 


5041 


6827 


784CIP2C 58 


3546


1470


3256


5042 


6828 


784CIP2C_59 


3548 


1471


3257 


5043 


6829 


784CIP2C_60


3551 


1472 




5044 


6830 


784CIP2C 61 


3553 


1473 


3259 


5045 


6831 


784CIP2C 62 


3564 


1474


3260 


5046 


6832 


784CIP2C 63 


3567 


1475


3261 


5047 


6833 


784CIP2C_64 


3572 


1476


3262 


5048


6834 


784CIP2C 65 


3573 


1477 


3263 


5049 


6835 


784CIP2C 66


3574 


1478 


3264 


5050 


6836 


784CIP2C 67 


3583 


1479 


3265 


5051


6837 


784CIP2C_68 


3615 


1480


3266 


5052 


6838 


784CIP2C 69 


3623 


1481 


3267 


5053 


6839 


784CIP2C 70


3629 


1482 


3268


5054


6840 


784CIP2C 71 


3666 


1483 


3269 


5055 


6841 


784CIP2C 72 


366^ 


1484


3270 


5056 


6842 


784CIP2C 73 


3906 


1485 


3271 


5057 


6843 


784CIP2C 74 


3912


1486 


3272 


5058 


6844 


784CIP2C_75


3924 


1487


3273 


5059 


6845


784CIP2C_76


3928 


1488 


3274 


5060 


6846 


784CIP2C 77 


3935 


1489 


3275 


5061 


6847 


784CIP2C 78


3959 


1490 


3276 


5062 


6848 


784CIP2C 79 


3981 


1491 


3277 


5063 


6849 


784CIP2C 80


3989 


1492 


3278


5064


6850 


784CIP2C_81 


4295 


1493 


3279 


5065


6851 


784CIP2C_82


4300 


1494 


3280


5066 


6852 


784CIP2C_83 


4360 


1495 


3281 


5067


6853


784CIP2C_84 


4362 


1496


3282 


5068 


6854 


784CIP2C_85


4371


1497


3283 


5069 


6855 


784CIP2C 86 


4373


1498 


3284 


5070 


6856 


784CIP2C 87 


4376 


1499 


3285 


5071 


6857 


784CIP2C 89


4378 


1500 


3286 


5072 


6858 


784CIP2C_90 


4382 


1501 


3287 


5073 


6859 


784CIP2C_91 


4409 


1502


3288 


5074 


6860


784CIP2C_92 


4421 


1503 


3289 


5075 


6861 


784CIP2C 93 


4421 


1504 


3290 


5076 


6862 


784CIP2C 94 


4426


1505


3291 


5077 


6863 


784CIP2C 95 


4430 


1506 


3292 


5078


6864 


784CIP2C_96 


4435 


1507 


3293 


5079 


6865 


784CIP2C_97 


4436 


1508 


3294 


5080


6866 


784CIP2C_98 


4439 


1509 


3295 


5081 


6867 


784CIP2C_99 


4440 


1510


3296 


5082 


6868 


784CIP2C_100 


4441 


1511 


3297 


5083 


6869 


784CIP2C_101 


4442 


1512 


3298 


5084


6870 


784CIP2C_102 


4455 


1513 


3299 


5085 


6871


784CIP2C_103 


4462 


1514


3300 


5086


6872 


784CIP2C 104 


4466 


1515 


3301 


5087 


6873 


784CIP2C 105 


4469 


1516 


3302 


5088


6874


784CIP2C 106


4477 


1517 


3303 


5089 


6875 


784CIP2C 107 


4481


1518


3304 


5090 


6876


784CIP2C_108 


4483 


1519 


3305 


5091 


6877 


784CIP2C 109 


4484 


1520 


3306 


5092 


6878 


784CIP2C 110 


4486 


1521 


3307 


5093 


6879 


784CIP2C_111


4490 


1522 


3308 


5094 


6880 


784CIP2C_112 


4499 


1523 


3309 


5095 


6881


784CIP2C_113 


4503 


1524 


3310 


5096 


6882


784CIP2C 114 


4506 


1525 


3311 


5097


6883


784CIP2C 115 


4509 


1526 


3312 


5098 


6884 


784CIP2C 116 


4514 


1527 


3313 


5099 


6885 


784CIP2C 117 


4516 


1528 


3314 


5100 


6886


784CIP2C 118 


4522 


1529 


3315 


5101 


6887 


784CIP2C_119


4525 


1530 


3316 


5102 


6888 


784CIP2C 120 


4527


1531 


3317 


5103 


6889 


784CIP2C 121 


4528 


1532 


3318 


5104 


6890 


784CIP2C 122 


4529 


1533 


3319 


5105 


6891 


784CIP2C_123 


4532 


1534 


3320 


5106 


6892 


784CIP2C 124 


4537 


1535 


3321 


5107 


6893 


784CIP2C 125 


4538 


1536 


3322 


5108 


6894 


784CIP2C_126


4551 


1537 


3323 


5109 


6895 


784CIP2C 127 


4552 


1538 


3324 


5110 


6896 


784CIP2C 128 


4559 


1539 


3325 


5111 


6897 


784CIP2C_129


4567 


1540 


3326 


5112 


6898 


784CIP2C_130 


4568 


1541 


3327 


5113 


6899 


784CIP2C_132 


4585 


1542 


3328 


5114 


6900 


784CIP2C 133 


4592 


1543 


3329 


5115 


6901 


784CIP2C_134 


4609 


1544 


3330


5116 


6902 


784CIP2C 135 


4616 


1545 


3331 


5117 


6903 


784CIP2C 136 


4617 


1546 


3332 


5118 


6904 


784CIP2C 137 


4618 


1547 


3333


5119 


6905 


784CIP2C 138 


4620


1548 


3334 


5120 


6906


784CIP2C 139 


4624 


1549 


3335 


5121 


6907


784CIP2C 140 


4632 


1550 


3336 


5122 


6908 


784CIP2C 141 


4634 


1551 


3337 


5123 


6909 


7B4CIP2C_142 


4638 


1552 


3338 


5124 


6910 


784CIP2C 143 


4639 


1553 


3339 


5125 


6911 


784CIP2C 144 


4643 


1554 


3340 


5126


6912 


784CIP2C_145 


4644 


1555 


3341 


5127 


6913 


784CIP2C_146 


4655 


1556 


3342


5128 


6914 


784CIP2C 147 


4668


1557 


3343 


5129 


6915


784CIP2C 148 


4677 


1558 


3344 


5130


6916


784CIP2C_149


4677 


1559 


3345 


5131


6917 


784CIP2C_150


4677 


1560 


3346 


5132 


6918 


784CIP2C_152


4682


1561 


3347 


5133 


6919


784CIP2C_153 


4690 


1562 


3348 


5134 


6920 


784CIP2C_154


4691 


1563 


3349 


5135 


6921 


784CIP2C_155


4727 


1564 


3350 


5136 


6922 


784CIP2C_156 


4730 


1565 


3351 


5137 


6923 


784CIP2C_157


4734 


1566 


3352 


5138 


6924 


784CIP2C_158


4757 


1567 


3353 


5139 


6925 


784CIP2C 159 


4764 


1568 


3354 


5140 


6926 


784CIP2C_160 


4786 


1569 


3355 


5141 


6927 


784CIP2C_161 


4793 


1570 


3356 


5142 


6928 


784CIP2C 162 


4825 


1571


3357 


5143 


6929 


784CIP2C 163 


4826 


1572 


3358 


5144 


6930 


784CIP2C_164


4850 


1573 


3359 


5145 


6931 


784CIP2C_165


4853 


1574 


3360 


5146 


6932 


784CIP2C 166 


4855 


1575


3361 


5147 


6933 


784CIP2C 167 


4856 


1576 


3362 


5148 


6934 


784CIP2C_168


4867 


1577 


3363 


5149 


6935 


784CIP2C_169 


4869 


1578 


3364 


5150 


6936 


784CIP2C_170


4878


1579 


3365 


5151 


6937


784CIP2C 171 


4880 


1580 


3366 


5152 


6938 


784CIP2C_172


4942 


1581 


3367 


5153 


6939 


784CIP2C_173 


4945 


1582


3368 


5154 


6940 


784CIP2C 174 


4950 


1583 


3369 


5155


6941


784CIP2C_175


4952 


1584 


3370 


5156 


6942 


784CIP2C_176 


4954 


1585 


3371 


5157 


6943 


784CIP2C 177 


4958 


1586 


3372 


5158 


6944 


784CIP2C_178


4961 


1587 


3373 


5159 


6945 


784CIP2C 179 


5590 


1588


3374 


5160 


6946 


784CIP2C 180


5599


1589 


3375 


5161 


6947 


784CIP2C 181 


5692 


1590 


3376 


5162 


6948 


784CIP2C 182 


5732


1591 


3377 


5163 


6949 


784CIP2C_183


5765 


1592 


3378 


5164 


6950 


784CIP2C_184 


5771 


1593 


3379


5165 


6951 


784CIP2C 185 


5774


1594 


3380


5166


6952 


784CIP2C_186 


5793 


1595 


3381


5167 


6953 


784CIP2C 187 


5806 


1596 


3382 


5168


6954 


784CIP2C_188 


5852 


1597 


3383


5169 


6955 


784CIP2C 189 


5892 


1598 


3384


5170 


6956 


784CIP2C 190 


6057 


1599 


3385 


5171 


6957 


784CIP2C_191 


6061 


1600 


3386 


5172 


6958 


784CIP2C_192


6ld9 


1601 


3387


5173 


6959 


784CIP2C 193 


6160


1602 


3388


5174


6960


784CIP2C 194 


5297 


1603 


3389 


5175 


6961 


784CIP2C_195 


6398 


1604


3390


5176 


6962


784CIP2C 196


6398


1605 


3391 


5177 


6963 


784CIP2C 197 


6415 


1606 


3392 


5178 


6964 


784CIP2C_198 


6446 


1607 


3393 


5179 


6965 


784CIP2C 199


6469 


1608


3394 


5180 


6966 


784CIP2C 200 


ait 


1609 


3395 


5181 


6967 


784CIP2C 201 


6561


1610 


3396 


5182 


6968 


784CIP2C_202


6574 


1611 


3397 


5183 


6969 


784CIP2C_203 


6578 


1612 


3398


5184


6970 


784CIP2C 204 


6662 


1613 


3399 


5185 


6971 


784CIP2C 205 


6672 


1614 


3400 


5186 


6972 


784CIP2C_206 


6691 


1615 


3401 


5187 


6973 


784CIP2C 207 




1616 


3402 


5188 


6974 


784CIP2C 208 


6746 


1617 


3403 


5189 


6975 


784CIP2C_209 


6898 


1618 


3404 


5190 


6976 


784CIP2C_210 


6936 


1619 


3405 


5191 


6977 


784CIP2C_211


6943 


1620 


3406


5192 


6978 


784CIP2C_212


7110 


1621 


3407 


5193 


6979 


784CIP2C 213 


7200 


1622 


3408 


5194 


6980 


784CIP2C 214 


7212 


1623 


3409 


5195 


6981 


784CIP2C 215


721H 


1624 


3410


5196


6982 


784CIP2C 216 


7249 


1625 


3411 


5197 


6983 


784CIP2C_217 


7500 


1626 


3412 


5198 


6984 


784CIP2C 218 


7509 


1627 


3413 


5199 


6985 


784CIP2C_219


7523 


1628 


3414 




6986 


784CIP2C 220


7544 


1629 


3415 


5201 


6987


784CIP2C_221


7564 


1630 


3416 


5202 


6988 


784CIP2C_222 


7568 


1631 


3417 


5203 


6989 


784CIP2C_223


7631 


1632 


3418


5204


6990 


784CIP2C 224 


7813 


1633 


3419 


5205


6991 


784CIP2C 225 


7831 


1634 


3420 


5206 


6992 


784CIP2C 226 


7843 


1635 


3421 


5207 


6993 


784CIP2C_227 


7907 


1636 


3422 


5208


6994 


784CIP2C_228 


7943 


1637 


3423 


5209 


6995 


784CIP2C_229 


8175 


1638 


3424 


5210 


6996


784CIP2C 230 


8216 


1639 


3425 


5211 


6997 


784CIP2C_231 


8225 


1640 


3426 


5212 


6998


784CIP2C_232 


8271 


1641 


3427 


5213 


6999 


784CIP2C_233 


8397 


1642 


3428 


5214 


7000 


784CIP2C 234


8466


1643 


3429 


5215 


7001 


784CIP2C_235 


8503 


1644 


3430 


5216 


7002 


784CIP2C_236 


8953 


1645 


3431 


5217 


7003 


784CIP2C_237 


9106 


1646 


3432 


5218 


7004 


784CIP2C 238


9139 


1647 


3433 


5219 


7005 


784CIP2C 239 


9555


1648 


3434 


5220 


7006 


784CIP2C 240 


9650 


1649 


3435 


5221 


7007 


784CIP2C_241 


9889 


1650 


3436 


5222 


7008 


784CIP2C_242 


9933 


1651 


3437 


5223 


7009 


784CIP2C 243 


9953 


1652 


3438 


5224 


7010 


784CIP2C_244 


9981 


1653 


3439 


5225 


7011 


784CIP2D 1


746 


1654 


3440 


5226 


7012 


784CIP2D 2 


3558 


1655 


3441 


5227 


7013 


784CIP2D_3 


3558 


1656 


3442 


5228 


7014 


784CIP2D_4 


3633 


1657 


3443 


5229 


7015 


784CIP2D 5


3658 


1658 


3444 


5230


7016 


784CIP2D 6 


37^ 


1659 


3445 


5231 


7017 


784CIP2D 7 


4004 


1660 


3446 


5232 


7018 


784CIP2D 8 


4700 


1661 


3447 


5233 


7019 


784CIP2D_9 


4703 


1662 


3448 


5234 


7020 


784CIP2D 10 


4774 


1663 


3449 


5235 


7021 


784CIP2D_11 


4894 


1664 


3450 


5236


7022 


784CIP2D_12 


4918 


1665 


3451 


5237 


7023 


784CIP2D 13 


5159 


1666 


3452 


5238 


7024 


784CIP2D_14


7443


1667 


3453 


5239 


7025 


784CIP2D_15


8673 


1668 


3454 


5240 


7026 


784CIP2D 16 


8679 


1669 


3455 


5241 


7027


784CIP2D_17


8727 


1670 


3456 


5242 


7028 


784CIP2D 18 


8734 


1671 


3457


5243 


7029 


784CIP2D_19 


8756


1672 


3458




7030 


784CIP2D 20 


8818 


1673 


3459 




7031 


784CIP2D 21 


8844 


1674 


3460


5246


7032


784CIP2D 22 


8846 


1675 


3461 


5247 


7033


784CIP2D 23


8912 


1676 


3462 


5248


7034


784CIP2D 24 


8918 


1677


3463 






784CIP2D 25 


8918 


1678 


3464 




7036 


784CIP2D 26 


8941 


1679


3465




7037 


784CIP2D 27


8941


1680 


3466 




7038 


784CIP2D 28 


8951 


1681


3467




7039 


784CIP2D 29 


8951 


1682


3468




7040 


784CIP2D_30 


9007 


1683 




5255 


7041


784CIP2D 31 


9012 


1684 




5256 


7042 


784CIP2D 32 


9013 


1685




5257 


7043 


784CIP2D 33 


9025 




3472


5258 


7044 


784CIP2D 34 


9053 




3473 


5259 


7045 


784CIP2D 35 


9054 


1688 


3474 


5260 


7046 


784CIP2D 36


9054 


1689 


3475 


5261 


7047 


784CIP2D 37 


9113 


1690 


3476 


5262 


7048 


784CIP2D 38 


9134 


1691


3477 


5263 


7049 


784CIP2D 39 


9152 




3478 


5264 


7050 


784CIP2D 40 


9152 




3479 


5265 


7051 


784CIP2D_41


9211 


1694 


3480 


5266 


7052 


784CIP2D 42 


9223 




3481 


5267 


7053 


784CIP2D_43 


9223 


1696 


3482


5268 


7054 


784CIP2D 44 


9231


1697 


3483 


5269 


7055 


784CIP2D 45 


9236 




3484 


5270 


7056 


784CIP2D_46 


9236 


1699 


3485


5271 


7057 


784CIP2D 47 


9303


1700


3486


5272 


7058 


784CIP2D_48


9309


1701 


3487


5273 


7059 


784CIP2D 49


9314 


1702


3488 


5274 


7060 


784CIP2D_50 


9326 


1703


3489


5275 


7061 


784CIP2D_51 


9339


1704


3490


5276 


7062 


784CIP2D_52


9348 


1705


3491


5277 


7063 


784CIP2D 53


9376 


1706


3492 


5278 


7064 


784CIP2D_54 


9382 


1707 


3493 


5279 


7065 


784CIP2D 55 


9407 


1708 


3494 


5280 


7066 


784CIP2D_56


9414 


1709 


3495 


5281 


7067 


784CIP2D 57 


9439 


1710 




5282 


7068 


784CIP2D 58 


9485 


1711 


3497 


5283 


7069 


784CIP2D 59 


9493 


1712 


3498


5284


7070 


784CIP2D_60


9501 


1713 


3499


5285 


7071 


784CIP2D 61 


9526 


1714 


3500


5286


7072 


784CIP2D 62 


9526 


1715 


3501 


5287


7073 


784CIP2D 63 


9551 


1716 


3502 


5288


7074 


784CIP2D 64 


9557 


1717 


3503 


5289


7075 


784CIP2D 65 


9568 


1718 


3504 


5290


7076 


784CIP2D_66 


9588 


1719 


3505 


5291 


7077 


784CIP2D 67 


9597 


1720 


3506 


5292 


7078


784CIP2D 68 


9615 


1721 


3507 


5293


7079 


784CIP2D 69 


9628 


1722 


3508 




7080 


784CIP2D 70 


9649 


1723 


3509 




7081 


784CIP2D 71 


9652 


1724 


3510 




7082 


784CIP2D 72 


9660 


1725 


3511


5297 


7083 


784CIP2D 73


96 £2 " 


1726 


3512 


5298 


7084 


784CIP2D 74 


9725 


1727 


3513 


5299 


7085 


784CIP2D_75


9746 


1728 


3514 


5300 


7086 


784CIP2D 76 


9777 


1729 


3515 


5301 


7087 


784CIP2D_77 


9787 


1730 


3516 


5302 


7088 


784CIP2D 78 


9790 


1731 


3517 


5303


7089 


784CIP2D 79 


9842 


1732 


3518 


5304 


7090 


784CIP2D 80 


9842 


1733 


3519 


5305


7091 


784CIP2D 81 


9848


1734 


3520 


5306


7092




9867 


1735 


3521 


5307 


7093 


784CIP2D 83


10010


1736 


3522


5308 


7094 


784CIP2D 84


10011 


1737 


3523 


5309 


7095 


784CIP2D 85


10052


1738 


3524 


5310 


7096 


784CIP2D 86


10057 


1739


3525


5311 


7097


784CIP2D 87


10085 


1740 


3526 


5312 




784CIP2D 89


10139 


1741 


3527 




7099 


784CIP2D 90


10142


1742 


3528 


5314 


7100


784CIP2D 92 


10165 


1743 


3529


5315


7101 


784CIP2D 93


10173 


1744 


3530


5316


7102 


784CIP2D 94 


10173 


1745 


3531 


5317


7103 


784CIP2D 95


10273 


1746 


3532


5318


7104 


784CIP2E 1 


3121 


1747 


3533 


5319


7105 


784CIP2E 2


3628 


1748 




5320


7106 


784CIP2E 4 


3673 


1749 




5321 


7107 


784CIP2E_5


4018 


1750 




5322 


7108 


784CIP2E 6 


4467 


1751 


3537


5323 


7109 


784CIP2E 7 


4865 


1752 




5324 


7110 


784CIP2E 8 


4916 


1753 


3539


5325 


7111 


784CIP2E_9


4923 


1754 


3540 


5326 


7112 


784CIP2E 10 


4926 


1755


3541


5327 


7113 


784CIP2E 11


4962 


1756




5328 


7114 


784CIP2E_12 


4963 




3543 


5329 


7115 


784CIP2E_13


4964 


1758 


3544 


5330 


7116 


784CIP2E_14 


4988 


1759


3545 


5331 


7117 


784CIP2E_15 


5635 


1760 


3546 


5332 


7118


784CIP2E_16 


7682 


1761


3547 


5333 


7119 


784CIP2E_17 


7682 


1762 




5334 


7120 


784CIP2E 18 


7699 


1763 


3549 


5335 


7121 


784CIP2E 19 


7707 


1764 


3550


5336


7122 


784CIP2E 20 


7707 


1765 


3551


5337 


7123 


784CIP2E_21 


7752 


1766 


3552


5338 


7124 


784CIP2E_22 


8357 


1767 




5339 


7125 


784CIP2E_23 


9065 


1768 




5340 


7126 


784CIP2E 24 


9324 


1769 


3555


5341 


7127 


784CIP2F 1 


2976 


1770 


3556


5342 


7128 


784CIP2F 2 


3559 


1771 


3557 


5343 


7129 


784CIP2F 3 


4021 


1772


3558


5344


7130 


784CIP2F_4 


4474 


1773


3559


5345


7131 


784CIP2F_5 


4566 


1774 


3560 


5346 


7132 


784CIP2F 6 


4705 


1775 


3561 


5347


7133 


784CIP2F 7 


4707 


1776 


3562 


5348 


7134 


784CIP2F 8 


4712 


1777 


3563 




7135 


784CIP2F 9 


5008 


1778 


3564 


5350 


7136 


784CIP2F 10 


5009 


1779 


3565 


5351 


7137 


784CIP2F_11


5015 


1780 


3566 


5352 


7138 


784CIP2F 12


5015 


1781 


3567 


5353


7139


784CIP2F 13 


7724 


1782 


3568


5354 


7140 


784CIP2F 14 


7725 


1783 


3569 


5355 


7141 


784CIP2F 15


8828 


1784 


3570 


5356 


7142 


784CIP2F 16 


8830


1785 


3571 


5357 


7143 


784CIP2F 17


9739 


1786 


3572 


5358 


7144 


784CIP2F 18


9896



TABLE 7



SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)


5359 


337 


1131 


AHLSARLSALILDEVAILPAPQNLSVLSTNMKHLLMWSPVIAPG 
ETVYYSVBYQGEYESLYTSHIW2PSSWCSLTEGP3CDVTDDITA 
TVP YNLRVRATLGSQTS /CLEHP /VS I PLIETQ PSLPDL /RME I 
TKDGFHLVIBLEDLGPQFE FLVAYWRRBPGAEEHVKMVRSGGI P 
VHLETMEPGAAYCVKAQTFVKAIGRYSAPSQTBCVEVQGEAIPL 
VLALFAFVGFMLILVVVPLFVWKMGRLLQ/YLLLPRGGSSQTPW 
KITQF 


5360 


2 


1115 


PRVRSSGGQEDPASQQWARPRFTQP3KMRRRVIARPVGSSVRLK 
CVASGHPRPDITWMKDDQALrRPEAAEPRKKKWTLSIiKNLRPED 
SGKYTCRVSNRAGA INAT YJCVDVI QRTRSKP VLTGTHP VNTTVD 
FGGTTS FQCKVRSDVKPVIQWLKRVEYGAEGRHNSTI DVGGQKF 
WLPTGDVWS RPDGS YLN KLLI TRARQDDAGM YI CLGANTMGYS 
FRSAFLTVLPDPKPPG PP VASSSSATSLPWPWI GI PAGAVFIL 
GTLLLWLCQAQKKPCTPAPAPPLPGHRPPGTARDRSGDKDLPSL 
AALSAGPGVGtfCEEHGSPAAPQHLLGPGPVAGPKLYPKLYTGHS 
TPHTYTHPPPSCQLNSSHS 


5361 


3 


925 


HEGSISSANILLDDQFQPKLTDFAMAH?RSHLEHQSCTINMTSS 
SSKELWYMPEE YIRQGKLSI XTDVYS FGI VIMEVIiTGCRVVLDD 
PKHIQLHDIiLRELMEKRGLDSCLSFLDKKVPPCPRMFSAKLFCL 
AGRCAATRAKIiRPSMDBVLNTLESTOASLYFAEDPPTSLXSFRC 
PSPLFLENVPS I PVEDDESQNNNLLPS DEGLRIDRMTQKTPFEC 
SQSEVMFLSLDKKPESKRNEBACNMPSSSCEESWFPKYIVPSQD 
IiRPYKVNIDPSSEAPGHSCRS RPVBS SCSSKFSWDEYEQYKKE 


5362 


2 


4879 


SCQVEGCTRTYNSSQSIGKHMKTAHPDQYAAFKMQRKSKKGQKA 
NNLKT PNNSKF VYFLPS PVNS SNPFFTSQTKANGNPACSAQLQH 
VSPP I FPAHLASVSTPLLSSMESV.1NPNITSQDKN2QGGMLCSQ 
MENLPSTALPAQKEDLTKTVLPLNIDRGSDPFLSLPAESSSIDL 
F PS PADSGTNSVFS QLENNTNHYS SQ IEGNTNS S FLKGGNGENA 
VFPSQVNVANK FSSTNAQQSAPEKVKKDRGRGQTGKERKPKHNK 
RAKWPAI IRDGKFI CSR CYRAFTNPRS LGGHLS KRS YCKPLDGA 
E I AQ ELLQSNG QP S LLASM I LSTNAVNLQQPQQS TFNPEACFKD 
PSFLQLLAENRSPAFLPNT FPRSGVTNFNTSVSQEGSE1 1 IQAL 
ETAG IPS TFEGAEMLS HV5 TG CVSDASQVNATVM FNPTVPPLLH 
TVCHPNTLLTNQNRTSNS KTS S I EECS SLPVFP TNDLLLKTVEN' 
GLCSSSFPNSGGPSQNFTSNSSRVSVISGPQNTRSSHLNKXGNS 
ASKRRKKVAPPIilAPNASQNLVTSDLTTMGLIAKSVEIPTTNLH 
SNVI PTCB PQS LVENLTQKLNNVNNQLFMTD VKENFKTSLESHT 
VLAPLTL KTENGDS QMMALNSCTTSVNSDLQI SEDNVI QNFEKT 
LEIIKTAMNSQILEVK5GSQGAGETSQNAQINYNIQLPSVNTVQ 
NNKLPDSSP\FSSFISVMPTESNIPQSE\VSHKEDQIQEILEGL 
QKLKLENDLSTPASQCVLINTS VTLTP TP VKSTADI TV1 QPVS3 
MINIQFNDKVNKPFVCQNQGCNYSAMTKDALFKHYGKIHQYTPB 
MILEIKKNQLKFAPFKCVVPTCTKTFTRNSNLRAHCQLVHHFTT 
EEMVKLKIKRPYGRKSQSENVPASRSTQVIQCQLAMTBENKKESQ 
PALELRAETQNTHSNVAVI PEKQLI EKKSPDKTES S LQVITVTS 
EQCNTNALTNTQ TXGRKIRRHKKEKEE KKRKKPVS QS LEFPTR Y 
S PYRP YRCVHQGCFAAFTIQQNLI LH Y QAVHKSDL PAFS AEVEE 
ESEAGKESEETETKQTLKEFRCQVSDCSRI FQAI TGLIQHYMKli 

HG IGLRASICTEEDG VYXCDCEG CDR I YATRSNLLRHI FNKHNDK 
HKAHLIRPRRLTPGQENMSSKANQEKSKSKHRGTKHSRCGKEGI 
KMPKTKPJOCra^ENKNAiaVQIEBNKPYSLKRGKilVYSIKARK 
DALSECTS RFVTQYP CMIKGCTS WTSESNI IRHYKCHKLSKAF 
TSQHRNLLI VFKRCCN S QVKETSEQEGAKNDVKDSDT CVSESND 
NSRTTATVS QKEVE KNE * DEMDELTELFITKli INEDSTS VETQA 
NTS SNVSNDFQEDNL CQSERQKASNLKRVNKEKNVSQNKKRKVE 
KAEPASAAELSSVRKEEETAVAIOTIEEHPASFDWSSFKPMGFE 
VSFLKFLEBSAVKQKKNTDKDHPNTGNKKGSHSNSRKN1DKTAV 
TSGNHVCPCKESETFVQFANPSQLCCSDNVKIVLDKNLKDCTEL
VfcKQLQEM K PTVS L KKLE VHSND PDMS VM2G>I S IGKATGRGQ Y 


5363 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRPvEANLVATCLPVRASLPHRIiNML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKROAQQMVQPQSPVAVS 
QSKPGCYDNGKHYQ INQQWERTYLGNALVCTCYGGSRGFNCES K 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLLQCICTGNGRG 
EWKCERHTS VQTTS SGSGPFTDVRAAVYQPQ PHPQ P PP YGHCVT 
DSGWYS VGMQLA* KTQGNKQML \ CTCXGNGVSCQ BTAVTQTYG 
GNSNGBPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYS FCTraTVLVQTRGGNSNGALCHFPFLYNNHWYTDCTSEGRR 
DNMKWCGTTQNYDADQKFGFC PMAAHEE ICTTNEG VM YRI GDQW 
DKQHDMGHMMRCTCVGNGRGE WTCI AYSQLRDQCI VOD I T YNVN 
DTFHKRHEEGHMLNCTC FGQGRGRWKCD P VDQCQDS ETGT FYQ I 
GDSWBKYVHGVRYQCYCYGRGIGEWHCQPLOTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLN5YTIKGLKPGWYEGQLISIQ0YGHQEVTRFDFTTTSTST 
PVTSNT\VTGETTPFS PLVATSES VTE1TAS SFWS WVSASDTV 
SGFRVEYELSEEGDEPQYLVLPSTATSV\N I P\DLLPGRKYI VN 
VYQI SEDGEQSLILST8QTTAPPAPPDPTVD0VDDTSI WRWSR 
PQAP ITGYR I VYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 
KVTIMWTPPESAVTGYRVDVIPVNLPGEHGGRLPLSRNTF\AEN 
TGI*S PGVTYY FKVFAVSHGRES KPLTAQQTTKL \ DAPTNLQFVN 
ETDSTVI>VRWT PPRAQ I TGYRLT VGLTRRGQ PRQ¥NVGPS VSKY 
PLRNLQPAS EYTVS LVAI KGNQES PKATG VFTTLQ PGS S I PPYN 
TE VTETTI V I TWTPAPRI GFKLG VR PSQGGEAP R EVTSDSGS I V 
VSGLTPG V B YVYTI QVLRDGQERDAP \ IVNK \ WTPLS PPTNLH 
LEANPDTGVIjT VS W ERSTTPD ITGYR ITTT PTNGQQGNS LEEW 
HADQSSCTF \DNLE VPGL3 YNVS VYTVKDDKES VP I SDT 1 1 PAV 
PPPTDLRFTN/ ILGPDTMRVTW\AP PPS IDLTNFLVRYS PVKNB 
GRMLQSLS I FFLSDN\AWLTNXLPGTEYWSVSSVYEQHESTP 
\LRGRQKTGLDSP\TGIDFS\D1TA\NSFT\VHW\IAPRA/TPI 
TGYRIR\HHPEHF\ SGRPREDR\VPHSRNS ITLTNLTPGTEYW 
SIVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLlXSWD 
APAVTVRYYR IT YGETGGNSPVQE FTVPGSKSTATI SGLKPGVD 
YTI TVYAVTGRGDS PAS 3 KP ISI NYRTE IDKPS QMQVTD VQDNS 
ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI 
EGLQPTVEYWS VYAQNP SGES Q PLVQTAVTNIDR P KGLAFTD V 
DVDS IKIAWESPQGQYSRYRVTY5SPEDGIH ELFPAPDGEEDTA 
ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKPT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
SVWSGLMVATKYEVSVYALKDTLTS1PAQGWTTLENVSPPRR 
ARVTDATETTITISWRTKrETITGFQVDAVPANGQTPlQRTIKP 
D VRS YT I TGLQ PGTD YKI YL YTLNDNARSS P WI DAS TA I DAP S 
NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 

RDfiVPJ7ZVTTTf!r.F DflTPYTTVtf T JlT VXTMO IT C TTDT. "!"/"• O VVTT^tPT n 
i\f vj v idrti x louciWiEil 1 A X V XJ\ijrSi\WjD^ C.fijJ.taKR.h.lL'ttlji' 

QLVTLPHPNLHG PE I LDV PSTVQKT P FVTKP G YDTGNG I QL PGT 
SGC3QPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQB 
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGATYNI IVEALKDQQRHKVREEWTVGNSVNEGLNQPT 
DDSC FDPYTVSH YAVGDEWERMSES G FKLLCQCLGFGSGHFRCD 
S S RW CHDNGVN Y KI G E KW DRQG ENGQMMS CTCLGNG KGE FKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGE PS P EGTTGQS YNQYSQRYHQRTNTNVNCP IECFMPLDVQ 
ACRED SRE 


5364 


8066 


703 


R LCCTGGGEGTPGASG KRGPAATTS L VLC I PS VP PPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANIjVATCIiPVRASLPHRI.NML 
RGPGPGLLLIiAVt*CLGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
SEQ
ID 
NO: 


Predicted
beginning 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








QSKPGCYDNGIOIxQiNOQWElRTYLGj^ALVCTCyGGSJlGFNCESlC " 
PEAEETCFDKVTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 
CKPIAEKCFDlIAAGTSYVVaETWEKPYO^WMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLLQCICTGNGRG 
EWKCERHT8VQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYS VGMQLA * KTQGNTKQMlA CTCLGNGVSCQETAVTQTYG 
GNSeXGEPCVLPPTYNGRTPYSCTTEGRQDGHLMCSTTSNYEQDO 
KYSFCXDHTVIiVQTRGGKSNGALCHFPFLYNNHNYTDCTSEGRR 
DKMKWCGTTQNYDADQKPGFCPMAAH3EI CTTN3GVMYRIGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCI AYSQLRDQC IVDD I TYNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDS WBKYVHGVR YQC YCYGRG IGEWHCQPLQTY ? S SSG P VEVF I 
TETPSQPNSHP IQWNAPQPSH ISKY ILRWRPKNSVGRW KEATIP 
GHLNS YTI KGLXPG WYEGQLISI QQYGHQEVTR FDFTTTSTST 
PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 
SG FRVEYEL S EEGDB PQ YI> VLPS TATS V\NI P \ DLLPGR KYI VN 
VYQ I S EDGEQSLI LSTSQTTAPDAPPD PT VDQVDDTS I WRWSR 
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 
KVTIMWTPPESAVTGYRVDVIPVNLPGEHGGRLPLSRNTF\AEK 
TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN 
ETDSTVLVRWTPPRAQ I TG YRLTVGLTRRGQ PRQYNVG PS VS KY 
PLRNLQPAS E YTVSLVAI KGNQESPKATGVFTTLQPGSS I PPYN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGBAPREVTSDSGSIV 
VSGIiTPGVEYVYTIQVLRDGQERDAP \ IVNK\ WTPLSPPTNLH 
LBANPDTGVIiTVSWERSTTPDITGYRITTTPTNGQQGNSIiEEVV 
HADQSSCTF\DNLEVPGLEY5JVSVYTVKDDKESVP I SDTI I PAV 
PPPTDIiRFTN/ILGPDTMRVTW\APPPSIDriTNFLVRYSPVKNE 
ORMLQSLS IFFLSDN\A WLTNLIiPGTEYWSVSS VYEQHES TP 
\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\ VHW\ IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTWLTPGTEYW 
SIVALNGREES PLLIGQQSTVSDVPRDLEWAATPTSUuI \SND 
APAVTVRYYRI TYGETGGNS PVQEFTVPGSKSTATISGLKPGVD 
YTITV YAVTGRGDSPASSKP IS INYRTE I DKPS QMQVTDVQDNS 
ISVKWLPSSS PVTGYRVTTT\ PKNGPG\ PTKXKTAGPDQTEMTI 
EGLOPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGliAFTDV 
DVDS I KIAWES PQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
BLQOLRPGSEYTVSWALHDDMESQPLIGTQSTAXPAPTDLKFT 
QVTPTSLSAQWTPPNVQliTGYRVRVTPKEKTGPMKE INLAPDSS 
SVWSGLMVATKYBVSVYALKDTLTSRPAQGVVTTLENVSPPRR 
ARVTDATETTITIS WRTKTETI TGFQVDAVPANGQTP I QRTI KP 
DVRS YT I TGLQ PGTDYKI YLYTLNDNARS SP WIDAS TAIDAP S 
NLRFIiATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTEATITGLE PGTEYT I YVT ALKNNQ KSBPI*I GR KKTDELP 
QLVTIiPHPNIJIGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQ PS VG QQM I FE EHGFRRTT P PTTATP IRHRPRP Y P PNVGQE 
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGATYNI IVEALRD QQRHKVRE E WTVGNS VNEGLNQPT 
DDSCFDPYTVSHYAVGDBKERMSESGFKIiLCQCLGFGSGHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCIjGNGKGEFKCDP 
HEATC YDDGKT YH VGEQWQKE YLGAI CSCTC FGGQRGWRCDN CR 
RPGGEPS PEGTTGQS YNQYSQRYHQRTNTNWCP I ECFMPLDVQ 
ADREDSRE






703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCI PS VPP PVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
QSKPGCYDNGKHYQINQQWBRTYIiGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGWGKGEWT 
CXPIAEKCFDHAAGTSYWGETWBKPYO^WMMVDCTCLGEGSGR 






WO 01/53312 



PCT/US00/34263 



SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








I TCrSRNRCNDQDTRTS YRIGDTWSKKDNRGNLLQCI CTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYS VGMQLA * KTQGNKQML \ CTCLGNG VS CQE T AVTQTYG 
GNSNGEPCVLPPTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYS ?CTDHTVLVQTRGGNSNGALCHFPFLYNNHbnfT0CTSBGRR 
DNMKWCGTTQN YDADQKPG FCPMAAHBE I CTTNEGVMYR 1 GDQW 
DKQHDMG HMMRCTCVGNGRGEWTC I AYSQLRDQCI VDDI TYNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GUSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVBVFI 
TETPSQPNSHPIQWMAPQPSHISKYILRWRPKNSVGRWJCBATIP 
GHLNSYTlKGLKPGVVYEWQLlSlQQYGliQEVTRFDFTTTSTST 
P VTSNT \ VTGETTP FS PLVATS ES VTEITASSF WSWVS ASDTV 
SGFRVBYELSEEGDEPQYIiVLPSTATSV\NIP\DZ,LPGRKYIVN 

vyqis edgeqsiii l*stsqttapdappdptvpqvddts i wrwsr 
pqap i tgyr i vys ps vegs stelnlpetans vtiis dlqp gvq yn 
i ti ya veenqes tp wtqqbttgt prsdt vps prdlqfve vtd v 
kvtimwtppesavtgyrvdvipvnlpgehgqrlplsrntfVaen 

TGI>SPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAI>TNLQFVN 
ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 
PLRNLQPASE YTVSLVAI KGXQES PKATGVFTTLQPGSS IPP YN 
TEVTETTIVITWTPAPRIGFKI/3VRPSQGGBAPREVTSDSGSIV 
VSGLTPGVEYVYTIQVLRDGQERDAP\IVNK\WTPLSPPTNDH 
LEANPDTGVLTVS WERS TT PDI TG YRI TTTPTNGQQGKSLEEW 
HADQS SCT F \ DN L F. VPG LE YMVS VYTViODDKBS VPI SDT I IPAV 
PPPTDIJiFTN/ILGPDTMRVTW\APPPSIDLTNFI>VRYSPVlQJE 
GRMLQS LS I FFLS DN\AWl,TNLLPGT3 YWSVSS VYEQHE S TP 
\LRGRQKTGLDS P \TGIDFS \ DITA\NS FT\ VHW\ I APRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 
SIVALNGREES PLLIGQQSTVSDVPRDLEWAATPTSIiLI \SND 
APAVTVRYYRI T YGETGGNS P VQEFTVpGS KSTAT I SGLKPG VD 
YTITVYAVTGRGDSPASSKPISIWYRTEIDKPSOMQVTDVQDNS 
ISVKWLPSSS PVTGYRVTTT\ PKNG PG \ PTKTKTAGPDQTEMT I 
EGLQPTVEYWS VYAQNPS GESQPLVQTAVTNI DRP KGLAFTDV 
DVDSIKrAWESPQGQVSRYRVTySSPEDGIHELFPAPDGBEDTA 
ELQGLRPGS EYTVS WALHDDMESQPLIGTQSTAI PAPTDLKFT 
QVTPTS LSAQWTP PNVQLTGYRVRVTPKEKTGPM KE INIiAPDS S 
SVWSGLMVATKYEVSVYALKDTLTSRPAQGVVTTLENVSPPRR 
ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP 
DVRS YT ITGLQPGTDYKI YLYTLNDNARSS PWI DASTAIDAP S 
NLRFLATTPNSLLVSWQPPRARITGY1 IKYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDBLP 
QLVTLPHPNLHG PE ILDVPS7VQ KTPFVTHPG YDTGNG IQL PGT 
SGQQPSVGQQM3 FEEHGPRRTTP PTTATP IRHRPRPYPPNVGQE 
AtjSQTTI 3 WAPFQDTSE YI I S CH PVGTDEEPLQFRVPGTSTSAT 
LTGLTRGAT YN 3 IVEALKDQQRHKVRBEWTVGNSVNEGLNQPT 
DDSCFDP YTVSKYAVGDEWBRMS ESGF KLLCQCLG FGS GH FR CD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCI)NCR 
RP GGE P S P EGTTGQS YNQYSQR YHQRTNTNVNCP I ECFM PLDVQ 
AOREDSRB 


5366


8066


703 


RLCCTGGGEGTPGASGKRGPAATTS IjVLCI PS VPPPVP FPTLWP 
P PS WRRQP PGGIRRDFSRRLR RE ANLVATCLP VRASLP HR IiNML 
RGPGPGIiLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQS PVAVS 
GSKPGCYDNGKHYQINQQMERTYLG>IALVCTCYGGSRGFNCESK 
PEAEETCFDXYTGNT YRVGDTYE RPKDSM I WDCTCI GAGRGRI S 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 
CKP I ABKCFDHAAGTS YVVGETWEKP YQGWMMVDCTCLGEGSGR 
ITCrSRNRCNDQDTRTSYRIGDTWSKKDNRGNLliQCICTGNGRG 
E WKCERHTS VQTTSSGS G PFTDVRAAVYQPQ PHPQP P PYGHCVT 
DSGWYSVGMQLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG 
GIJSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSrrSWYEQDQ 
SEQ
ID

NO:


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








KYSFCTDHTVLVQ'rRGG^NGALCHFP&LYNNH^YTDCTSEGRR 
DNM KWCGTTQNYDADQKFGFCPMAAHEB I CTTNEG VM YR IGD QW 
DKQHIWGHAIMRCTCVGNGRGEWTCIAYSQIiRnQCI VDD ITYNVN 
DTFHFOTEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDSWEKYVHGV^YQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNS YTI KGLKPG WYEGQL I S I QQYGHQEVTRFDFTTT S TST 
PVTSMT\VTGETTPFSPIiVATSESVTEITASSFWSWVSASDTV 
SGFRVEYEI*SEEGDBPQYIiVLPSTATSV\NlP\DLLPGRKYIVN 
VYQI S BDGEQSLILSTSQTTAPDAPPDPTVDQVDDTS I WRWSR 
PQAPITGYR I V YS PS VEGS STEI/NLP ETANS VTLSDLQPG VQYN 
ITIYAVEENQESTPWIQQBTTGTPRSDTVPSPRDLQFVEVTDV 
KVTI MWTPPESAVTGYRVDVI PVOTjPGEHGQRLPLSRNTF VftEN 
TGLS PGVTYYFKVFAVSHGRESKPIiTAQQTTJCL\DAPTNLQFVN 

etdstvlvrwtppraqitgyrltvgltrrgqprqynvgpsvsky 
plrnlqpase ytvslvai kgnqes pkatgvfttlqpgss ippyn 
tevtettivitl^tpaprigfklgvrpsqggeaprevtsdsg£5iv 
vsgltpgvbyvytiqvlrdgqerdap\ivnk\wtplspptnlh 

LEANPDTGVLTVSWERSTT?DITGYRITTTPTNGQQGNSL,EEW 
tC\DQSSCTP\DWLEVPGLEYNVSVYTVKDDKESVPlSDTIiPAV 
PP PTDIiRFTN/ 1 I*GPDTMRVTW\ APPP S I DLTNFLVR YS PVKNE 

grmlqslsiffi*sdn\awltnllpgteywsvssvykqhestp 
\lrgrqktgldsp\tgidfs\dita\nsft\vhn\iapra/tpi 
tg yr i r \hhp bhf \sgrpredr\ vphsrns i tltnlt pgte yw 
s i valngrees plligqq stvsdvprdlewaatptsll i \ s wd 
apavtvr yyr i tygetggns p vqeftvpgs ks tatisgucpg vd 
yt itvyavtgrgds pas skp i s inyrte idk p sqmqvtdvqdns 
isvkwlpssspvtgyrvttt\pkngpg\ptktktagpdqtemti 

EGLQPTVEYWSVYAQNPSGESQPLVGTAVTNIDRPKGIAFTDV 
DVDSIKIAWESPOGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLRPGSBYTVSWALHDDMESQPLIGTQSTAIPAPTDLXFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPXEKTGPMKEINLAPDSS 
S VWSGLMVATKYE VS VYALXDTLTS RPAQGVVTTLENVS PPRR 
ARVTDATETTI TI S WRTKTETITGFQVDAVPANGQTPI QRTIKP 
D VRS YT I TGLQ PGTD YKI YLYTLNDNARSS P WI DASTA I DAPS 
NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEYTIYVIALKWNQKSEPLIGRKKTDELP 
QLVTIipHPNLHGPE ILDVPSTVQKTPFVTHPGYDTGNG IQLPGT 
SGG^PSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQS 
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGAT YN 1 1 VEALKDQQRHKVREE WTVGNSVNEGLNQPT 
DDSCFDPYTVSKYAVGDEKERMSESGFKLLCQCLGFGSGHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICS CTCFGGQRGWRCDNCR 
RPGGE PS PEGTTGQS YNQYS QR YHQRTNTNVNCP I E CFMPLDVQ 
ADREDSRE 


5367


235 


3591 


KKILNMLCKKNX VIEYLAD3C LYEYLYG FCFSGIKKYLTThVLRIT"'" 
I LELWNTRLLI<EKS VSLQTQYLLLIVKIIiS WFPGKEMRHHLQ I M 
E VMMRKQDS / RIVGNGSEQQLQKELADVLMDPPMDDO PGBKELV 
KRSQLDGEGDGPLSNQLSASSTINPVPLVGLQKPBMSLPVKPGO 
GDSE ASSPFT PVADEDS WFS KLTYLGCAS VNAPRS E VE ALRMM 
S ILRSQCQI SLD VTLS VPNVSEGI VRLLDP QTNTE IANYPIYKI 
ur uvnviUAyi rwULiAr Ii£iHYNAEL»FRIHVrTtCEIQEAVSRI 
LYS PATAFRRSAKQTPLSATAAPQTPDSD I FTFSVSLE I KEDDG 
XGY FS AVP KDKDR QCFKLRQG I DKKIVI YVQQTTNKE LA I ER CP 
GLLLSPGKDVRNSDMHIiLDLESMGKSSDGKSYVITGSWNPKSPH 
FQWNEET P KDKVLFMTTAVDL VITE VQBP VRFLLETKVRVCS P 
NERIiFWPFSXRSTTENFFLKLKOIKORBRKNNTDTLYEWCLBS 
ES ERERRKTTASPSVR LPQSGSQSS VI P S P PEDDEE EDNDEPLL 
SGSGDVSKECAEKILETNGEI.LSKWHUTLNVRPKQLSSLVRNGV 
PEALRGBVWQLLAGCHNNDHLVEKYRILITKESPQDSAITRDIN 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








RTFPAHDYFKDTGGDGQDSLYKICKAYSVYDEEIGYCQGQSFIiA 

RLMQE YI PDI» YNHFLD ISLEAHMYASOWFLTLFTAKFPIiYMVFH 
I IDLLLCEG IS VIFKVALGLLKTSKDDLW.rDFEGALKFPRVQL 
PKRYRSEENAKKLMBLACNMKISQKKLKKYEKEYHTMREQQAQQ 
EDP t ERFERENRRLQEANMRLEQENDDLAHELVTS KIALRKDLD 
NAEEKADALNK3LLMTKQXLIDAEEEKRRLEEESAHLKKMCRRE 
LDKAESEIKKNSSIIGDYKQICSQIjSERIiEKQQrANKVEIEKIR 
QKVDDCERCREFFNKEGRVKGISSTKEVLDEDTDEEKETLKNQL 
REMEI.ELAQTKI A \QLVEASCK1QD\LEHPF*GLPFKE\VQAA\K 
KTWFNRTLSS I KT ATGVQG KETC 


5368 


573 


2014 


GAAAGAAD P RRG S 1LGGRTMLDFAI PAVTFLLALVGAVLYLYPAS 
RQAAG I PG I TPTBEKDGNLPD 1 VNS G S1»HE FLVNLHERYG P WS 
FWFGRRLWSLGTVOVLKQHIWPNKTLD/r,F*NHAEVI I JCVSIW 
WWQCE * KP\ QRKKLYENGVTDSLKSNFALLLXLPEELLDKWI*S Y 
PETQH\VPLSQHMLGFAMKSVTQMVMGSTFEDDQEVIRFQKNHG 
TVWSE IGKG FLDGSLDKKMTRKKQYEDALMQLES VLRN I IKERK 
GRNFSQHIFIDSLVQGNLNDQQILED3M3 FSLASCIITAKLCTW 
A I WFLTTS EEVQKKL YBE INQ VFGNGP VTPEKI EQLRYCQH VLC 
ETVRTAKLTPVSAQLQDIEGKIDRFI IPRETLVLYALGWLQDP 
NTWPSPHKFDPDRFDDBI*VMKTFSSIjGF8GTQECPELRFAYMVT 
TVX.LSVLVKRLHLLSVEGQVIETKYELVTSSREEAWITVSKRY 


5369 


1 


6622 


PRSLCFSLWAEAAVLADGGIiRRRRRLLRGTMSASFVPNGASLED 
CHCNLFCLADLTGIKWKKYVWQGPTSAPILFPVTEEDPILSSFS 

rclkadvlg/vwrrdqrperrbNl* IFWGGEDP\ VLLTLFTMTY 

QKKKME CGRMDF PMNAVL CFSKAVHKfLLERCI»MNRNFVR I G KWF 
VKPYEKDEKPINKSEHLSCSFTFFLHGDSNVCTSVEINQHQPVY 
LLSEEHITLAQQSNS PFQVI LCPFGI*NGTLTGQAFKMSDSATKK 
LIG EWKQFYP I S CCLKEMSEEKQEDMDWEDD SLAAVE VLVAGVR 
MIYPACFVLVPQSDIPTPSPVGSTHCSSSCLGVHQVPASTRDPA 
MSSVTLTPPTS PEEVQTVDPQSVQKWVKFSS VSDGFNSDSTSHH 
GGK1 P R KLANHVVDR VWQE CNKNRAQN KRKYSA S S GGLCEE ATA 
AKVASWDFVBATQRTNCS CLRHKNLKSRNAGQQGQAPSLGGQQQ 
ILPKHKTNEKQEKSEKPQKRPLTPFHERVSVSDDVGMD\ADS\A 
SQRLVN I SAP \DS Q\ VRFSNI R\TNDVAK\ 7PQMHGTEMANSPQ 
PPPLSP\HPCDWDEGVTKT?STPQSQHFYQMPTPDPLVPSKPM 
EDR3 DSLSQSFPPQYQEAVEPTVYVGTAVNLEEDEANIAWKYYK 
FPKKKDVEFLPPQLPSDKFKDDPVGPFGQESVTSVTELMVQCKK 
PLKVSDELVQQ YQI KNQCLSAIAS DAEQEPKIDP YAFVEGDEEF 
LFPDKKDRQNS EREAGKKHKVEDGTS SVTVLSHEEDAMS LFS PS 
IKQDAPRPrSKARP PSTS L I YDSDLAVS YTDLDNL FNSDEDELT 
PGSKRSANGSDDKASCKESKTGNLDPLSCISTADXHKMYPTPPS 
LEQKIMGFSPMNMKNKEYGSMDTTPGGTVLEGNSSSIGAQFKIE 
VDEGFCS P KPS B 1 KDFS YVYKPENCQ I L VGCSM FAPLKTliPSQY 
LPIiIKLPEECIYRQSWTVGKLELLSSGPSMPFIKEGDGSNMDQE 
YGTAYTPQTHTSCGMPPSSAPPSNSGAGILPSPSTPRFPTPRTP 
RTPRTPRGAGGPAS AQGS VKYENSDLYS PASTPSTCRPLNS VE P 
ATVPS I PEAHSLYVNLILSESVMNLFKDCNSDSCCI CVCNMNIK 
GADVGVYI PDPTQEAQYRCTCGFSAVMNRKFGNNSGLFFEDELD 
IIGRNTDCGKEAEKRFEALRATSAEHVNGGLKESEKLSDDL ILL 
LQDQCTNLFSrFGAADQDPFPKSGVISNWVRVEERDCCNDCYLA 
LEHGRQFMDNMSGGKVDEALVKSSCLHPWSKRNDVSMQCSQDtL 
RMLLSLQPVLQDAIQKKRTVRPWGVQGPLTWQQFHKMAGRGSYG 
TDESPEPLPI PTFLLG YDYD YLVLSP FALPYWERLMLEP YGS C2R 
DIAYWLCPENEALT»NGAKS FFRDLTA I YESCRLGQHRPVSRLL 
TDGIMR VGS TASKKLSEKLVAE WFSOAADGtJNEAFSKLKLYAQV 
CRYDLGPYLASLPLDSSLLSQPNLVAPTSQSLITPPQMTNTGNA 
NTPSATLASAASSTMTVTSGVAISTSVATANSTLTTASTSSSSS 
SNtWSGVSSNKLPS FPPFGSMNSNAAGSMSTQANTVQSGQLGOQ 
QTS ALQTAG I SGBS SSLPTQPHPD VSESTMDRDKVG I PTDG05H 
AVTYP PAI WYI I D P FTYENTPES TNS SS VWTLGLLR CFLEMVQ 






SEQ

ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








TLPPHIKSTVSVQIIPCQYLLQPVKHBbRfelYPQHLKSlAPSAP 
TQCRRPLPTSTNVKTLTGFGPGLAMETALRS PDRPECIRLYAPP 

TDL YGELLETC I INID VPNRARR KKS S ARKFGLQKL WE WCLGLV 
QMSSLPWRWIGRLGRIGHGELKDWSCLLSRRNLQSLSKRLKDM 
CRMCG1SAADS PSILSACLVAJME PQGSPVIMPDSVSTGSVFGRS 
TTLNMQTSQLNTPQDTSCTHILVFPTSASVQVASATYTTEMliDL 
APNPNNDGADGMGIFDLLDTGDDLDPDI INI LPASPTGSPVHSP 
GSHYPHGGDAG XGQS TDRLLSTEPHE 3VPN I LQQPLALGYF VST 
AKAvjPLPU W F W S AC PQAQYQC PL FLKASLHLH VPS VQSDELLHS 

KHSHPIiDSNQTSDVLRFVLEQYNALSWLTCDPATQDRRSCLPIH 
FWLNQTjYKFIMNML 


5370


1226 


716 


RWSRKLELRRAAQATBSRP PQSQEMHPPTGKEVHALKRLRDSAN 
ANDVETVQQLLEDG^PCAADDKGRTALHFASCNGNDQIVQLLL 
DHGADPNQRDGZX3NTPIiHLAACTNHVPVITTLLR3GARVDALDR 
AGRTPLHLAKS KLN I LQEGHAQ CLKAVR / HGGEADH P YAEGVSG 
APRAT*AARCSGVFPSPSRWLGSAPWSRSSCTIWSLPLHEAKCR 
AVRPLSSAAQGSAPSSSSCCrVSTSLALAESLSLFRACTSLPVG 
GCISwL 


5371 




167 


IAAMLWKLLLRSQSCRLCS FRKMRS P PKYRPFIACFTCffDKQS~~ 
S KENTRTVE KL YKCS VD I RKIRR \ * KDG Y F * RMKPMLKKLRI / P 
LQELGADETAVAS I LERCP E AI VCS PTAVNTQRKLWQLVCKNEB 
ELI KLIEQFPESFFTIKDQSNQKIiNVQFFQELGLKNWI SRLLT 
AAPNVFHNPVEKNKQMVRILQESYLDVGGSEANMKVWLLKLLSQ 
KPFI LLNS PTAI KETLEFLQEQGFTS FE 3 LQLLS KLKG FLFQL C 
PRS IQNS ISFS KNAFKCTDHDLKQLVLKCPALLYYSVPVLEERM 
QGLLREG I S IAQI RETPMVLELTP Q I VQ YR I RKLNSSG YR I KDG 
HIANLW5SKKEFBANFGKIQAKKVRPLFNPVAPLNVEE 


5372 


51 


es7 


SPGAQFLWAAPDMPDPLFSAVQGKDEILHKALCFCPWLGKGGME 
PLRLLIIJJFVTELSGAHNTTVFQGVAGQSLQVSCPYDSMKHWGR 
RKAWCRQLGEKGPCQRVVSTHNLWLLSFLRRWNGSTAJTDDTLG 
GTLT ITLRNLQPHDAGLYQ CQS LHGSE ADTLRKVLVE VLADPLD 
HRDAGDLWFPG\ DLRAS RM PMWSTAS PGAS WKEK3PS HPLPSFS 
SWPASFSSRF+QPAPSGLQPGMDRSQGHIHPVNWTVAMTQGISS 
KLCOG 


5373 


2814 


346 


VKKTKSIFNSAI4QEMBVYVENIRRKFGVFNYSPFRTPYTPNSQY 
QMLLDPTNPSAGTAKIDKQEKVKLNFDMTASPKILMSKPVLSGG 
TORRI SLSDMPRSPMSTNSSVHTGSDVEQDAEKKATS SHFSASE 
ESMDFLDKSTAS PASTKTGQAGSLSGSPKPFS PQLSAPITTKTD 
KTSTTGS ILNLNLDRSKAEMDLKBLSESVQQQSTPVPLIS PXRQ 
IRSRFQLNLDKTIESCKAQLGINEISEDVYTAVEHSDSEDSEKS 
DS SDS E YI S DDEQKS * GTSQEDTEDKEGCQMD KEPSAVKKKP KP 
TNPVE I KEELKS TSPASEKADPGAVKDKAS PE PEKDFSGKAKPS 
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE 

GRKTiIICK"E PKR P 55 PtTfinVTC K" TP PC TTVRWC ODPTDt/T .Tt> e e * r\ 

TSAAGATATTSTSSTVTVTAPAPAATGSPVKKQR PLLPKE \ TAP 
AVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQSSPLVTSSGSM 
STLVS S VNGDL P I GTASADVAADIAKYTS KL\ MDAIKGTM\TEI 
YNDLSKN\mJKAQLAEDSQGLRIEIEKLQWLHQQEL\SEMKHN 
LELTMAEMRQS WEQERDRL IAEVKKQLELBKOQAVDETKXKQWC 
ANFKKEAI F YCCWNTS YCD YPCQ \ QAHWPEH\MK3 CTQSATAPQ 
\QEADAE\ VNTETLNKSS QG SS S STQS APS ETASA\SKEKETS A 
EKS KE SGSTLDLSGSRBTP SS ILLGSNQGS DHS R\ SNKS S WS SS 
DEKRGS\ TRSDHN/ TPSTQHGRS LLPGKES RAGTP FLGTSK 


5374 


2814


346 


VKKTKS I FNSAMQEME VYVEINI RRKFGVFNYS PFRTPYTPNSQY 
QMLLDPTNPSAGTAKIDKQEK?m^FDMTASPKIL < NISKPVLSGG 
TGRRISLSDMPRSPMSTNSSVHTGSDVEQDAEKKATS SHFSASE 
ESMDFLDKSTAS PAS TKTGQAGSLSGS PKPPSPQLS AP ITTKTD 
KTSTTGS ILNLNLDRSPCAEMDLKELSESVQQQSTPVPLISPKRQ 
IRSRFQLNLDKTIESCKAQLGINEISEDVYTAVEKSDSEDSEXS 
DSSDSEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP 
SEQ
ID 

NO: 


Predicted 
beginning 
nucleotide
location
corresponding
to first
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








TNP VEIKEELKSTS PASBKADPGAVKDKAS PEPSKDPSGKAKPS 
PHP I KDKLKGKDETDS PTVHLGLDSDSE\NELVI DLGEDHSGRE 
GRKNKKEPKEPSPKQDVVGKTPPSTTVG3HSPPBTPVLTRSSAQ 
TS AAGATATTSTSSTVTVTAPAPAATGS P VKKQRPLI»P KE \TAP 
AVQRSOGTSSTVQQKEITQSPSTSTITLVTSTQSSPLVTSSGSM 
STLVSSVNGDLPIGTASADVAADIAKYTSKli\MDAIKGTM\TEI 
YNDLSKN\TTWKAQ1AEDSQGLRIEIEKLQWI*HQQEIj\SEMKHN 
LELTMAEMRQSWBQERDRLIAEVKKCLEXEKQQAVDETKKKQMC 
ANF KKE AI P YCCWNTS Y CD YP CQ\QAHWPBH \ MKSCTQSATAP Q 
\QEADAE\VNTETLNKSSQGSS SSTQSAPSETASA\S KEKETSA 
EKS KESGSTLDLSGSRET PSS I LLGSNQGSDHSR \9N KS S WSSS 
DBKRGS\TRSDHN/TPSTQHGRSLLPGKESRAGTPFLGTSK 


5375 


2907 


1X16 


HI FLAEEEPMLERRCRGPIiAMG PAQPRLLSGPSQESPQVLGKE^S 
RGLRQQGTSVA\QSGAQAPGRAHRCAHCRRHPPGWVA\LWI>HTR 
RCQA/RGLPI.PCPECGRRFRHAPFLALHRQVKAAATPDWGFACH 
LCGQS FRG WVAI> VLH LRAHS AAKAG P ?ACP KMARDA F WR R KAAS 
«?C3 1 JjKKUn faK FKOFK tr c ICvjN \*&Ro I LiFTWDQ / I»KVAH KRVHV 
SRRP*ERGPPAKVFWGPRPRGPPTGDTPPGPGGDAVDRPF\QCA 
CCGKR FRH K\ PNL IRSHAAC rSGER PHQ / CS R E CG \ KRFTNKP Y 
LTS\HRRITHTARQPYPCKECGRRFRHKPNLLSHSKIHKRSEGS 
AQAAPGPGSPQLPAGPQESAAEPTPAVPLKPAQEPPPGAPPEHP 
QDP I EAP PSL YS CDDCGRS FRLERFLRAHQRQHTGER P FTCAEC 
GKNFGKXTHLVAHSRVHSGERPFRLARKCGRRFLPRASOSGGRN 

saepnaprfgp f vcpdcgkafrhkp ylaahr p i atpaekp yvcp 
dcukapsqksnlWshrrihtgerpyacpdcdrsfsqksnlith 

RKSHI RDGAFCCAI CGQTFDDEERLLAHQKKHDV 


5376 


4504 


591 


VSTFSLCLWPAGGGGRGRVSNMAQSKRHVYSRTPSGSRMSAEAS ' 
ARP LRVGSRVE V 1 G KGHRGT VAY VG ATLFATGKW VGV I LDE AKG 
KNDGTVQGRKYFTCDEGHG I FVRQSQ IQVFEDGADTTS PETPDS 
SAS KVLKREGTDTTAKTS KLRGLKP KXAPTARKTTTRRP KPTRP 
ASTGVAGASSSLGPSGSASAGELSSSEPSTPAQTPLAAPI I PTP 
VLTSPGAVPPLPSPSKEEEGLRAQ\niDIJ3EKLETLRLKRAEDKA 
KLKELEKHKIQLEQVQEWKSKMQEQQADIX}RRIjKEAj^KBAiCEAL 
EAKBRYMEEMADTADAIEMATLDKEI^ERAESLQQEVEALKER 
VDEIiTTDLEILKAEIEEKGSDGAASSYQLKQLEEQNARLKDALV 
RMRDIjSSSEKQEHVK\LQKLMEKKNQELEVVRQQRERLQEELSQ 
AESTIDELKEQVDAAI^AEEMVEMLTDRNLNLEEKVRELRETVG 
DLBAMNEMNDELQENAR ETEI.ELREQL DMAGARVRE AQ KRVEAA 
QETVAD YQQT I KKYRQLTAHLQDVNRELTNQQBAS VERQQQP P P 
ETFDFKI KFABTKAHAKAIEMEr.RQMEV7AjO^RHMSIiLTAFMPD 
SFLRPGG DHD CVIiVLLLMPRLI CKAEL I RKQ AQ E K FEL S ENC S E 
RPGLRGAAGEQLS FAAIGLVY\SLMPAAGHRYHRY* CHALSQCR 
LD\VYKKVGSLYPEMSAHERSbDFLIELLHKDQLDETVNVEPLT 
KAIKYYQHLYS IHLAEQPEDCXMQLADHIKFTQSALDCMS VBVG 
RLRAFLQGGQBATDIALLLRDLETSCS \ DIRQFCKKIRRRMPGT 
DAPGI PAALAPGPQVSDTLLDCRKHLTWWAVLQEVAAAAAQIiX 
APLAENEG LLVAALE EUVFKAS EQIYGTPSSSPYECLRQS CNIL 
ISTMMK\LVTAMQEGEYDAERPPSKPPP\VELRAAALRAEITDA 
EGI/StiKLEDRBTV I KELKKSIiKIKGEELSEANVRLT LLEKKLDS 
AAKDADER IE KVQTRJkE ETQALLRKKE KEFBETMDALQADI DQL 
EAEKAEL KQRLNSQS KRT I EGLRGPPPSGIATLVSG I AGE EQQR 
GAIPGQAPGSVPGPGLVKDSPLLLQQISAMRLHISQLQHENSIL 
KGAQMKASLASLPPLHVAKLSHEGPGSELPAGALYRKTSQLLET 
IiNQLSTHTHWD ITRTS PAAKS PSAQLMEQ VAQLKS LSDTVEKL 
KDEVLKETVSQRPGATVPTDFATFPSSAFLRAKEEQQDDTVYMG 
KVTFSCAAGFGQRHRLVLTQEQIiHQLHSRLIS 


5377 


762 


1106 


DVPCKRVLPAEAQEKGQLTLS OGESGEEG\F * YHEVRQAEGES * 
/MFGPNVRI.VHTQLKTKKPSGTLECAKFYLHTGSTKFAARISCTX 
S S * WPG YDGWWGGQ Y I FI FRGMRWEEQP 


5378


2009 


664 


Q ASGTTLRPLPDLP QLKRREATS RNRALKPRGRLVLMTSCLP ftl» 
RFIATPRLSAMPHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTR 
SEQ
ID
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








sfsfrnskqtysgvpiiaanmdtvgtfbmakVlcks*vpgsfwd 
vpqmgcvfliyklftliotkmlllsvllpas ilvaekfslftavh 

KHYSLVQWQEFAGQNPDCLBHLAASSGTGSSDFEQLEQI LEAI P 
QVKYICLD VANGYSEHFVEP VXD VRKRFPQHTI MAGNWTGEMV 
BEL I L SGAD 1 1 KVG 1 G PGS VCTTRKKTGVG YPQLSAVME CADAA 
HGLKGHI1SDGGCSCPGDVAKAFGAGADFVMLGGMLAGH6ESGG 
EL Z ERDG KKYKLF YGMS S * I \ AM \ KKYAGGVAE YRASEG KTVKV 
PFKGDVEHTIRDILGGIRSTCTYVGAAKLK£LSRRTTFIRVTQQ 
VNPIFSEAC 


5379


2009 


664 


QASGTTLR PLPDLPQLKRREATS RNRALKPRGR LVLMTS CLPAL 
RFIATPRLSAMPHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTR 
S FS FRNS KQTYSGVP 1 1 AANMDT VGTFBMAKVLCKS * VPGSFWD 
VPQMGCVFL I YKL FTLKWKMLLLS VLL PAS I LVAEKFSLFTAVH 
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSD7EQLEQILSAIP 
QVKYICLDVANGYSEHFVEFVKDVRKRFPQHTrMAGNVVTGEMV 
EELILSGADI I KVG IG PGSVCTTR KKTGVG YPQLSAVMECADAA 
HGLKGHI I SDGGCS CPGDVAKAFGAGADFVMLGGMLAGHS SSGG 
EL IERDG KKYKLF YGMS S * I \ AM\KK YAGG VAEYRASEGKTVE V 
P? KGDVEHTIRDILGGI RSTCT Y VGAAKLKBLSRRTTFI R VTQQ 
VNPIFSEAC 


5380 


2 


2050 


PSRAGGAERGRAAAARS PGG5AAGWECPSVLDEAGACTMSSCVS 
SOPS SNRAAPQDELGGRGSSSS ESQKPCEAhRGLSS LSI HLGME 
SPlVVTECEPGCAVDLGltARDRPLEADGQEVPLDTSGSQARPHL 
SGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSP 
RLPRRPTVESHHVSITGMQDCVQLNQYTLKDEIGKGSYGWKLA 
YNENDNTYYAMKVLSKKKIiIRQAAFPRRPPPRGTRPAPGGCIQP 
RGPI \EQVYQEIA\ ILKKLDHPNWS KLVEVL\DDPNEDHLYMV 
F\ELVNQGPVMEVPTLKPLSEDQAR?YFQDLIKGIEYLHYQKI I 

hXrdikpsnllvgedghikiadfgvsnefkgsdallsntvgtpa 
fmapeslsetrki fsgkaldvwamgvtlycfvfg*cp fmderim 

CLHSKIKSQALEFPDQPDIAEDLKDLITRMLDKNPESRIWPEI 
KLHPWVTRHGASPLPSEDSNCTLVEVTEEEVENSVKHI PSLATV 
ILVXTMIRKRSFGNPFEGSRREERSLSAPGNLLTKKPTRECESL 
SELKT*K3SPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR 
*PEPPRTDBALCPxETGRTCWAPLLQVLWWVOTPLPFPLSTSWL 
PDLVGAPGSHFCFLNIALLRYNSHTM 


5381


2 


2050 


PSRAGGAE RGRAAAARS PGGSAAG WECPS VLDEACjACTI^SS CVS ~ 
SQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSLSIHLGME 
SP IWTECEPGCAVDLGLARDRPLEADGQEVPLDTSGSQARPHL 
SGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSP 
RLPRRPTVESHHVSITGMQDCVQLNQYTLKDE1GKGSYGWKLA 
YKEXTONTYYAMKVLS KKXLI RQAAFPRRPP PRGTRPAPGG C I QP 
RG P I \ EQVYQE IA\ I LKKLDH ENW\ KLVE VL \D DPNEDHL YMV 
F\ELVNQGP VMS VPTLKPLSEDQAR F YFQDLIKG I E YLH YQKI I 
H\RD I KPSNLLVGEDGH I KI ADFGVSNE FKGSDALLSNTVGTPA 
FMAPESLSETRKI FSGKALDVWAMGVTLYCFVFG * CPFMDERIM 
CLHSKIKSQALEFPDQPDIAEDLKDLrTRMLDKNPESRIWPEI 
KLHPW VTRHG AE PLPSEDENCTLVEVTEEEVENS VKH I PS LATV 
ILVKTMIRKRS FGN P FEGS RREERS LS APGNLLTKKPTRE CESL 
SELKT*KISPLPACCKVT*BFPHPSGCRPSCWQPPFLHTHSQPR 
♦PEPPRTDEALCPYBTGRTCWAPLLQVLWWVGTPLPFPLSTSWL 
PDLVGAPGSHFCFLNIALLRYNSHTM




1536


203


GARGSQQDAP ALQBAE VRGP ERAQ PARGRMTKARL FRLWLVLGS 
VFMILLIIVYWDSAGAAHFYLHTSFSRPHTGPPLPTPGPDRDRE 
LTADSDVDBFLDKFLSAGVKQSDLPRKETEQPPAPGSMEESVRG 
YDWSPRDARRSPDQGRQQAERRSVLRGFCANSSLAFPTKERPFD 
DI PNSELSHL I VDDRHGAI YCYVPKVACTNW KRVM I VLSGSLLH 
RGAPYRDPLRI PREHVKNASAHLTFNKFWRRYGKLSRHLMKVKL 
KKYTKFLFVRDP FVRLI S AFRS KFELEMEEF/ * PQVRRAHAAAV 
RQPHQPARLGARGLPRWPQ \ VS FANF I QYLLDPHT3KLAP FNEH 
WRQVYRLCHPCQIDYDFVGKLETLDEDAAQLLQLLQVDLAAPLP 
SEQ
ID
NO:


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










PELPGTGPPSSWBEDWFAKIPLAl^RQQLYXLYSADFVLFGypKP"" 
ENLLRD 




5383 


45 


5250 


VSRLLGCRNS KRTWRML I S KNMP WRRLQG 1 S FGMYSAEELKKLS 
VKSITNPRYLDSLGNPSA^GLYDLALGPADSKEVCSTCVQDFSN 
C5GHLGHIEliPLrVYNPLLFDKLYLLLRGSCLNCHMLTCPRAVI 
KLLLCQLRVLE VGAIiQAV YELBRI LSRFLE BNADPSASE I REEL 
EQYTTEI VQNNLLGSQGAHVKNVCES KS KL I ALFWKAHMNAKRC 
PHCKTGRSWRKEHNSKLTITFPAMVHRTAGQKDSEPLGIEEAQ 
IGKRGYLTPTSAREKLS^U^KNEGFFLNYLFSGMDDDGMESRFN 
PS VFFLDFL WP PS RS R P VSRLGDQM FTNGQTVNLQAVMKDWL 
IRIOjLALMAQEQKLPEEVATPTTDEEKDSLIAIDRSFLSTLPGQ 
SLIDKLYWIWIRLQSHVNIVFDSEMDKLMMDKYPGIRQILBKKE 
GLFRKHMMGKRVDYAARSVICPDMYINTNSIGI PMVFATKLTYP 
QPVTPWNVQELRQAVIWGPNVHPGASMVINEDGSRTALSAVDMT 
QREAVAKQLLTPATGAPKPQGTKIVGRHVKKGDILLLNRQPTLH 
RPSIOAHRARILPBEKVLRUJYANCICAyNADFDGDEMNAHFPQS 
ELGRAEAYVLACTDQQ YLVPKDGQ PLAGL 1QDHM VS<3ASMTTRG 
CPFTREHYMELVYRGLTD KVGR VXLLS PS ILJC PFPLWTGKQ WS 
TLLINI I PEDHI PLNLSGKAKITGKAWVKETPRSVPGFNPDSMC 
ESQVI I REGELLCGVLDKAHYGS S AYGLVHCCYE I YGGETSGKV 
LTCLARLFTAYLQLYRGFTXjGVED 1LVKPKADVKRQR I IEESTH 
CGPQAVRAALNLPEAAS YD3VRGKWQDAHLGKDQRDFNMI DLKF 
KEBVNHYSNE INKACMP FGLHRQFPENTLQLMVQSGAKGSTVNT 
MQISCT»t»GQI ELEGRST PLMASGJCSLPCPEPYEFTPRAGGFVTG 
RFLTGIKPPEFFFHCMAGREGLVDTAVKTSRSGYLQRCI IKHLE 
GLWQYDLTVRDSDGSWQPLYGEDGLDIPKTQFLQPKQFPFLA 
SNYB V IMKSQHLHEVL SRAD PKKALHH FRAI KKWQSKHPNTLIiR 
RGAFLS YSQKIQEAVKALKLBSENRNGR/RPWDS /G/RMLRMWY 
ELDEE SRRKYQKKAAAC PDPSLS VWR PDI Y FASVS ETFETKVDD 
YSQEWAAQTE KS YEKSELS LDRLRTLLQL\KWQRS LCEPGEAVG 
LLAAQS I GEPS TQMTLNTFHFAGRGEMNVTLG I PRLRE 1 LMVAS 
AIWKTPMMSvyVLNTKKALKRVK^LKIOQLTRVCLGEVLQKIDVQ 
ESFCMEEKQNKFQVYQLRFOFLPHAYYQQBKCLRPEDILRFMET 
RFFKLLMES IKKKNNKASAFRNVNTRRATQRDLDNAGELGRSRG 
EQEGDEE EEGH I VDAEAEEGDADASDAKRKEKQE EEVD YESEE S 
EEREGEENDDEDMQEERNPHREGARKTQEQDEEVGL/GH*GGPV 
PSRPPDAAPETHPQ PGAPGA\EAMER RVQAVRE I HPFI DDYQ YD 
TEESLWCQVTVKLPLMKINFDMSSLVVS LAHGAVI YATKG I TRC 
LLNETTNNKNEKELVLNTEGINLPELFKYAEVLDLRRLYSNDXH 
Al ANT YG I EAALR V I E KE I KDVFAVYG IAVD PRHLS L VAD YM CF 
EGVYKPLNRFGI RSNS S PLQQMTFETS FQFLKQATMLGSHDELR 
S PSACLWGKWRGGTSLFELKQPLR 






5384


196 


886 


qsggqrlptvl*l*gppgscpcilslf\pgrphalpeirpyini 
tilkgdkgdpgpmglpgymgregpqgepgpqgskgdkgemgspg 
apcqkrffafsvgrktalesgedfqtllfervfvnldgcfdmat 
gqfaaplrgiyffslnvhswioyketyvhimhnqkbavilyaqps 
ersimqsqsvmldlaygdrvwvrlfkrqrbnaiysndfdtyitf 
sghlikaedd 




5385


326 


799 


LMVPRTKKEAPAP P KAEAKAKAIi \ KAKKAVL KD VKSHKKNKI HM 

sptprrpktl*lrrqpkypwkstprrnkldhhviikfpltte*a 

VXKI KrWSLLVFTVDVKANKHQIKQAVKK/LCDIDVAKVNTL JQ 
SDGERKAYVRLAPDYDALWATKIGIT 


5386 


326 


799 


LMVPRTKJCEAPAPPKAEAKAKAL\KAKKAVLiCDVHSHKKNKIHM 
SPTFRRPKTL*LRRQPKYPWKSTPRRNKLDHHVIIKFPLTTE*A 
VKK I ENN SLLVFTVDVKANKHQ I KQAVKK/LCDIDVAKVNTL IQ 
SDGERKAYVRLAPDYDALWATKIGIT 




5387 


2 


2117 


FWAASGGCWFVI4GERRAGSLLSASYGTFAMPGMVLFGRRWAIA " 
SDDLVFPGFFELWRVLWWIGILTLYLMHRGKLDCAGGALLSSY 
LIVLMILLAWICTVSAIMCVSMRGTICNPGPRKSMSKLLYIRL 
AL FF PEM VWAS LGAAWVADGVQ CD RTWNG I IATWVS W I r IAA 
TWSIIIVFDPLGGKMAPYSSAGPSHLDSHDSSQLLKGLKTAAT 
SEQ
ID 

NO: 


Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








SVWETRIKLLCCCIGKDDHTRVAFSSrAELPSTYFSDTDfcVPSD 
IAAGLALLHQQQDNIRKNQSPAQWCHAPOSSQEADLDABIiKNC 
HHYMQFAAAAYGWPLYIYRNPLTGIjCRIGGDCCRSKNPQTMT/M 
VGGDQIjQL/ CTSAPILHTHRAAVQGLHPRQIiPWTRFTELPFLVA 
LDHR KZSVWAVRGTMSLQDVhTDbSAES B VLD VECEVQDRLAH 
KGISQAARYVYQRLINDGILSQAFS iape yrlvi vghslgggaa 
ALLATM VRAA YPQ VR CYAFS PPRGLWSKALQEYSQS PI VS LVLG 
KDVI PRLS VTNLEDL KRR I LRWAHCNKPKYKI LLHGLWYELFG 
GNPNNLPTELDGGDQEVLTQPLLGEQSLLTRWSPAYSPSSDSPL 
DSS PKYPPI*Y PPGR I IHfcQEEGASGRFGCCSAAH YSAKWSHEAB 
PSKI L XGPKMLTDHMPDI LMRALDSWSDRAACVSCPAQGVS S V 
DVA 


5388 


1569


753 


TADGGAGGGGRRQAGVRRHYLYPP^VkRRRAACQAERPAARS 
KDTDIAAYQKGNLGVQLRNM AQE TNHSQVPMLCS TGCGF YGNPR 
TNGMCS VCYKEHLQRQNS SNGRI S P PVQCTDGS VPEAQ SALDS T 
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETBDVQASVS 
DTAQQPSEEQS KS LB\NRNKKRIAVSCAGRKWDLI*GLNAGVEMF 
TWYTVTQM YT I ALTITKQMLKNFVFQQE FKS FGS FHQQLLE YK 
ILEHLQTKN


5389 


1569 


753 


TAfrGGAGGGGkUGAGVRRiJ yLyp ftgg VrrPvRaa4qaerpaa5s 
KDTDLAAYQKGNLGVQLRNMAQETNHSQVPMLCSTGCG FYGNPR 
TNGMCSVCYKEHLQRQNSSNGRISP PVQCTDGS VPEAQSAIiDST 
SSSMQPSPVSNQSLL3ESVASSQLDSTSVDKAVPETEDVQASVS 
DTAQQPSEEQS KSLE\NRNXKRIAVS CAGRKWDKLGLNAGVEM F 
TWYTVTQMYT IALTITKQMLKNFVFQQEFKSFGS FHQQLLEYK 
ILEHLQTKN 


5390 


217 


1332 


E D PR KLMBDKMWS ECEG PEMS LVCLTD FQAHAREQLS KSTRDF t 
EGGADDSITRDDNIAAFKRIRLRPRYLRDVSEVDTRTTIQGEEI 
SAPI CI APTGPHCLVWPDGEMSTARAAQAA\GI C YI TSTPASCS 
LEDI VIAAPEGLRWFQLYVTIPDLQIiNKQLlQRVESLGFKALVIT 
LDTP VCGNRRHDI RNQLRRNLTLTDLQS PKKGNAI P YFQMTP I S 
T SLCWNDLSWFQS I TRLP 1 1 L KGI LTKEDAEIAVKHNVQG 1 1 VS 
NHGGRQLDEVLASIDAIiTEWAAVKGKIEVYLDGGVRTGNDVLK 
ALALG AKC I FLGDA I LW AUAS KG EHG VKE VLNI LTNE FHTS MA \ 
LTCCRSVAB INRNLVQFSRL 


5391 


1 


1292 


VKKMGRSRGPPTAGGQRCEEAPGTVMERRLGVRAt^VKENHGSF 
Q PP VCNKLMHQE QL KVM FVGG PNTRKD YH IEEGE EVFYQLEGDM 
VLRVLEQGKHRDWIRQGE IFIiLPARVPHSPQRFANTVGI*WER 
RRLETELDGLRYYVGDTMDVLFEKWF YCKDIjGTQLAPI IQEFFS 
SEQYRTGKPIPDQIiLKBPPFPLSTRSIMEPMSLDAWLDSHHREI* 
QAGTPLSLFGDTYETQVIAYGQGSSEGLRC3NVDW7LWQLEGSSV 
VTMGGRRLSLG P WMDSLLVLS WGPS Y \ AW \ERTQGS VALS VT\Q 
DPACKKSPWGEPSCHGLKAATGVPSTLEVPSLPNNSPSPHYLSV 
YCRCVPHRPAHCCHPPSCPSQPRCHAPGRAAAPHLLWQTQPTAL 
PVUPG3LPPAPLLP I PLSLQTQCSTSTPRRPSIJKAS 


5392 


1 


1623 


IRGSNAQKWGASGSGGAGPQPDPAGPGGVPALAAAVIjGACEPR

CAAPCPLPALSRCRGAGSRGSRGGRGAAGSGDAAAAAEWIRKGS 

F IHKPAHGWLHPDAR VLGPGVS YVVR YMGCIEVLRSMRS LDFNT 

Rl^VTREAIlTOLHEAVPGVRGSWKKKAPNKALASVLGfCSNLRFA 

GMSIS I H ISTDG LS L SVPATRQVI ANHHM PS ISFASGGDTDMTD 

Y VAYVAKDP I NQRACH I LE C CEGL\ AQS 1 1 STVGQAFELRFKQ Y 

UiSPPKVALPPERIAGPEESAWGDBEDSLEHNYYNSIPGKEPPL 

GGLVDSRLAIiTQPCALTALDQGPSPSLRDACSLPWDVGSTGTAP 

PGDGYVQADARGPPDHBEHLYVNTQGLDAPEPBDSPKKDLFDMR 

PFEQALKLHECSVAAGVTAAPLPLEDQWPSPPTRRAPVAPTEEQ 

LRQBPWYHGRMSRPJU\ERMLRADGDFXVRDSVTNPGQYVI#TGMH 

AGQPKHLI»LVDPBGVVRTKDVLFESISI{LIDHHLQNGQPIVAAE 

SELHLRGWSREP 


5393 


2 


982 


GGDSAGMTMETQMSG^CPRNLWUjQPLTVIJjIJASADSOAAAP 
PKAVLKLEPPWINVLQXEDSVTLTCQGAPQP/ERSDSIQMFHNG
\nlipthtqps\yrfkannn\dsgeytoqo , gotsl\sdpvhltv 

LSEWLVLQTPHLEFQEGETIMLRCHS \ WRDK? \LVKVTFFQNGK 
SQKFSHLDPTPS I PQANHSHSGDYHCTGNIGYTLFSSKPVTITV 
QVPSMGSSSPMGXIVAWIAXAVAAIVAAWALIYCRKKRISAN 
STDP VKAAQ FEP PGRQM I AI RXRQ LE ETNND YETADGGYMTLNP 
RAPTDDDKNIYLTLPPNDHVNSNN


5394


2 


982 


GGDSAGMTM B TQ ^QNVCPRNLW&LQPLT VLLkliASADS QAAAP 
PKAVLKLEPPWINVLOXEDS VTLTCQGAPQP /ERSDSIQWFHKG 
\NLIPTHTQPS\YRPKANNN\DSGEYTCQTGQTSL\SDPVHLTV 
LSEWLVU2TPHLEFQBGETIMLRCHS\WRDKP\liVKVTFFQNGK 
SQKFSHLDPTFS I PQANHSHSGDYHCTGNIGYTLFSSKPVTI TV 
QVPSMGSSS PMGI I VAWI ATAVAAIVAAWAJLI YCRKKR ISAN 
STDPVKAAQFEPPGRQMIAIRKRQLEETNNDYETADGGYMTLNP 
RAPTDDDKNIYLTLPPNDHVNSNN 


5395


3135


531 


RASDAKNQEGLLNTRRKSTDS VP I SKSTLSRS LS^QASbFDGAS 
S S GN PEAVALAPDAYSTGSS S AS STLKRTKKPRP PSLKKXQTTK 
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEETPLEPAAG P KAACPLDS ES VEG WPP AS GGGRVQNS PP VG 
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRjEFDYSEDKS 
SWDNQQENPPPTKK1GKKPVAKMPLRRPKMKKTPEKLDNTPASP 
PRSPAEPNDIPIAJCGTYTFDIDKWDDPNFNPFSSTSKMQBSPKL 
PQQS YNFDPDTCDE S VD P FKTSS KTPS S PSKS PAS FBI PASAME 
ANGVDGDGIiNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP 
TPAATPETPP VI SA WKATD E EKLAVTNQKWTCMTVDIiEADKQD 

DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP 
SEA I E X TAPEGS FASABALLS RLAHP VSLCGALI) YLEPDLAEKN 
PPLFAQKLQRBAAHPTDVSISKTALYSRIGTAEVEKPAGLLFQQ 
PDLDSALQIARAEI ITKEREVSEWKDKYEESRREVMEMRKI VAE 
YEKTIAQMIBDEQREKSVS\HQTVQQLVI*EKEQA\LADIjNSVEK 
\ SLADLFRRYBKMKEVLEGFRKNEBVLKRCAQEYLSRVKKEEQR 
YQALKVHA\BEKLDRANAE \ IAQVRGKAQQEQAAHQASLAERSS 
CRV\DALERTLEQKNKEI EELTK1CDE1.I AKMGKS 


5396


3135 


531 


RASPAXNQEGI,IJarRRKSTDSVPl5KSTIiSRSr,SLQASDFZX3AS 
S SGNPEAVALAPDAYSTGS S SAS STLKRTKKPRPPSLKKKQTTK 
XPTBTPPVKETQQBPDEESLVPSGENLASETKTBSAKTEGPSPA 
LLEBTPLBPAAGPKAACPLDSE3VEGWPPASGGGRVQUSPPVG 
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 
SWDNC^ENPPPTKKIGKKPVAKMPIjRRPKMKKTPEKLDNTPAS? 
PRSPAEPNDIPIAKGTYTFDI DKWDDPNFNPFSSTS KMQES PKL 
PQQS YNFDPDTCDESVDPFKTSS KTPSSPSKSPASFEI PASAME 
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRS PLSDPPSQD? 
TPAATPETPPVI S AVVHATDEB K1AVTNQKWTCMTVDLEADKQD 
YPQFSDLSTFWETKFSSPTEBLDYRNSYEIEYMEKIGSSLPQD 
DDAPKKQALYLMFDTSQBSPVKSSPVRMSESPTPCSGSSFBETE 
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP 
SEAIBITAPEGS FAS ADALLS RLAH PVSLCGALD Y L E PDLAEKN 
PPLFAQKLQREAAHPTDVSISXTALYSR IGTAEVEKPAGLLFQQ 
PDLDSALQIARAEIITKERBVSEWKDKYEESRREVMEMRKIVAE 
YE KTI AQM I EDEQRE KSVS \HQTVQQLVLEKEQA\ LADLNS VEK 
\SLADLFRRYEKMiG3VLEGFRKNEBVLKRCAQEYLSRVKKEEQR 
YQALKVHA\BEKIiDRANAE\ IAQVRGKAQQ3QAAHQASLAERSS 
CRV\DALERTLEQKNKEIEELTKICDEI*IAKMGKS 


5397 


3135 


531 


KASUAKNQEGLLNTfeRKS^rDSVPI SKSTLSRSIjSLQASDFDGAS ' 

S SGNPEAVALAPDAYSTGSSSAS STLKRTKXPRPPSLKKKQTTK 

KPTB^PPVKETQQEPDEESLVPSGENIASBTKTESAKTBGPSPA 

LLEBT PLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 

RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 

SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTFASP 

PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL
PQQSyKFDPDTCD^SVDP^^SK^PSSPSKSPASraiPASAME 
ANGVDGDGLNKPAKKKKTPLKTDT FRVKKSPKRS PLSDPPSQDP 
T P AAT P ETP PV I S AWHATD E E KLAVTNQKWTCMT VDLEADKQD 
ypQPSDLSTFVNETKFSSPTEBIiDYRNSYBIEYMBKrGSSLPQD 
D0AP KKQALYLMFDTSQE S P VKSSPVRMS BSPT P CSGS S FEETE 
ALVNTAAKNOHP VP RGLA PNQESHLQVP E KSSQ KELEAMGLGTP 
S EAJ 6 ITAPEGS FAS ADALL S RLAHPVSLCGALD YLEPDLAEXN 
PPLFAQKLQREAAHPTDVS ISKTALYSRIGTAEVEKPAGLLFQQ 
PDLDSALQIARAEI ITKEREVSEWKDKYEESRREVMEMRKIVAE 
Y BKT IAQM I EDEQREKS VS \HQTVQQLVLEKEQA\I*ADLNSVEK 
\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQBYLSRVKKEEQR 
YQALKVHA\ SEKLDRANAE \ IAQVRGKAQ QEQAAHQ AS LAERS S 
CRV\DALERTLEQXNKEIEELTKICDELIAI^JGKS 


5398 


56 


5426 


SGEVCRMESNFNQEGVPRPSYVFSADPjARPSElNFDGIKIiDLS
HE FSLVAPNTEANSFES KDYLQ VCLR I R P FTQSE KELESEGCVH 
II»DS0TVVLKEPQ CILGRLS EKS SG\QM \AQKFS FFPGFLG PAT 
TQKEFFQGCIMHP\VKDLLKGQSRIiIFTYGLTNSGKTYTFQGT3 
ENlRILPRTLNVLFDSLiQERLYTKMNLKPHRSRBYLRLSSEQEK 
EBI ASKS ALLRQI KEVTVHNDSDDTLYGSLTNSLNISEFEES I K 
DYEQANLKMANS IXFSVWVSFFEIYNBYI YDLFVPVSSKFQKRK 
MIiRLSQDVKGYSFIKDLQWIQVSDSKBAYRLLKLGIKHQSVAFT 
KLNNASSRSHSIPTVKILQIEDSEMSRVIRVSELSLCDLAGSER 
TMKTQNEGEREjRETGNINTSLLTLGKCINVLKNSEKS kfqqhvp 
FRESKLTHYF/ QSFFNGKGKI CMI VNI SQCYLAYDETLNVLKFS 
AIAQKVCVPDTLNSSQEKLFGPVKSSQBVSLDSNSNSKILNVKR 
ATrSWENSLEDLMEDEDLVBELENAEETED/VGETKLLDEDLDK 
TLEENKAFISHEEKRKLLDLIEDLKKKLINSKKEKLTLEFKIRE 
EVTQEFTQYWAQREADFKETLLQERElLEBNAERRbAIFKDLVG 
KCDTRE EAAKDI CATKVETEEATACLELKFNQIKAELAKTKG EL 
I KTKEELKKREKESDSLIQELETSNKKI ITQNQRI KELINIXDQ 
KEDTlKEFQNLKSHMENTFi<CNDKADTSSLlINNKLICNETVEV 
P KDSKSKI CS3RKRVNENBLQQDE? PAKKG S I MVS SAITEDQ KK 
SEEVRPNIAEIEDIRVLQENNEGLRAFLIiTIENBIiKNEKEEKAE 
LNKQIVHFQQSLSLSEKKNLTLS KEVQQ IQSNYDIAI AELHVQK 
SKNQEQEE KIMKLSNEI ETATRS ITNNVSQIKLMHTKIDELRTL 
DS VSQISN IDLLNLRDLSNGS EBDNLPNTQ LDLLGND YLVSKQ V 
KEYRIQEPNRENSFHS S IEAI WEECKE I VKASSKKSHOIEELEQ 
QIBXI^AEVXGYKDENNRLKE KEHKNQDDLLKEKETL 1 QQLKBE 
LQEKNVTLDVQIQHVVEGKRALSELTQGVTCYKAKIKELETILE 
TQKVERS H S AKLEQDI LEKES I ILKLBRNLKBFQEHLQDS VKNT 
KDLNVKELKLKEEITQLTNNLQDMKHLLQLKEEEEETNRQETEK 
LKE ELS AS5ARTQN\LNADLQRKE ED YADLKEKLTDAKKQ I KQV 
QKEVSVMRDEDKLLRIKINELEKKKN0CSQSLDMKQR\TICX3LK 
EQL INQKVEEAI QQ YERACKDLNVKE K I IEDMRMTLEEQEQTQV 
EQDQVL \ EAKL5EVERLATBLDRWR VKCNDLETKNNQRS NKEHE 
NNTDVLGKLTNLQDBLQESEQKYNADRKKWLEEKMMLITQAKEA 
ENIRNKEMKKYAEDRERFFKQQNEMEILTAQLTEECDSDLQKWRE 
BRDQLVAALEIQLKALISSNVQKDNEIEQLKRIISETSKIETXJI 
MD I KPKRI S S ADPDKLQTB P LSTS FE ISRNKI EDGS WLDS CEV 
STEtraQSTRFPKPELEIQFTPLQPNKMAVKHPGCTTPVTVXlPK 
ARKRKSNEMKEDL VXCENKXNATPRTNL KF Pi S DDRNS S VKKEQ 
KVAIRPSSKKTYSLRSQAS IIGVNLATKKKEGTLQKFGDFLQHS 
PS I LQ S KAKKI I ETMSS S KLSNVHASKENVSQ PKRAKRKL YTS B 
ISSPID1SGQVILMDQKMKESDHQIIKRRLRTKTAK 


5399 


705 


230 


^PRMAKFLSQDQINEYKECFSLYDKQQRGKIKATDLMVAMRCLG 
ASP T PGEVQRHLQTHGI DGNGELDFSTFLTIMHMQI KQEDPKKE 
ILLAMLMVDKEKKG YVMASDLRS KLTS LGEKLTHKEV \ DDL FRE 
\ADIEPNGKVKYDEFIHKITSYLDGTY 


5400 


931


248 


SHCSSGME I PPTN YPAS RAALVAQN Y INYQQGTPHRVFEVQKVK 
QASMEDIPGROHXYRLKFAVBE I 1 QKQVXVNCTAE VLYPS TGQE 
TAP EVNFTFEGETGKNPDEEDNTFYQRLKSMKEPLEAQNI\PDN
FGNVS P EMTLVLHLAWVACG Y I IWQNSTEDTWYKMVKlQTVKQV 

QRNDDFIELDYTILLHNIASQ2IIPWQMQVLWHPQYGTKVKHNS 
RLPKEVQLE 


5401 


3 


1360 


TGWSYGPTT^LAPlAPRDFPPPPKLLIHPQAVVRLSCGAGSMGS 
QAAAEWRNWASWEGSSSLSOCSMGCFKDDRIVFWTWMFSTYFMB 

kwaprqddmdfyvrrklaysgsesgadgrkaaepevevevyrrd 
smjpglgdpdidweesvclnlilqkldymvtcavctradggdi 
hihkkksqqvfaspskhpmdskgebskisypniffm:dsf\be\ 

VFSI^TVGKGEMVCVELVASDKTNTFQGVIFQGSIRYBALKKVY 
D10RVSVAARMAQK\MSFGFSKYSNMEF\VR\MKGPQGKGHABMA 
VSRVSTGDTS POGTE ED SS PAS PMHERVTS FSTP PTPE RNNRPA 
PFSPSLKRKVPRNRIAEMKKSHSANDSEEFFRSDDGGADLHNAT 
NLR3RSI^GTGR5LVGSWLKLNRADGNFLLYAHLTYVTl»PIiHRI 
LTDILEVRQKPILMT 


5402


3445 


1563 


GEC FIMAA VVQQNDLVFE FASNVM 2DERQLGDPAI FPAV I VEHV 
PGADILNS YAGLAC VEEPMDMITESSliDVAEEE I IDDDDDDITL 
TVEASCHDGDETIETIEAAEALLNMDSPGPMLDEKR1NNNIFSS 
PEDDMVVAPVTHVSVTLDGIPEVMETQQVQEKYADSPGASSPEQ 
PKRKKGRKTKPPRPDSPATTPNI SVKKKNKDGKGNTI YLWBFLL 

MNYEPMGRALRYYYQRGIIiAKVEGQRLVYQFKEMPKDLIYINDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHWQ 
PVQAVPEGEAARTSTMQDETLNSSVQSIR\TIQAPTQVPVWSP 
RNQQNLHTVTLQrvPLTTVIASTDPSAGTGSQKFILQAIPSSQP 
>HVLKENVMIiQSQKAGSPPSIVLGPARV\QQVLTSNVQTICNGT 
VSV\ASSPSFS \ AT AP WTLFLLGSSQLVAHPPGTVITSVI ECTQ 
ETKTLTQEVEKKESEDHLKENTBKTEQQPQPYVMWSSSNGFTS 
QVAMKQNELLEPNSF 


5403 


3445 


1563 


GECFI MAAWQQNDLVFEFASNVMEDERQLGDPAI FPAVI VEHV 
PGADILNSYAGLACVEBPNDMITESSLDVAEEEIIDDDDDDITL 
T VEAS CHDGDBT I ETIBAAEALLKMDS PGPMLDEKRINNNI FSS 
PEDDM WAPVTHVS VTLDG I PE VMETQQ VQB KYAD S PGASS P EQ 
PKRKKGRKTKPPRPDSPATTPNISVKKKNKDGKGNTIYLWEFLL 
ALLODKATCP K v I KWTOREKG T FKTAm ^ KPVQP T.W3 Kwrr* v zt \ r» 

MNYEPMGRALRYTYQRGIIiAKVEGQKLVYQFKEMPXDLlYINDE 
DPSSSIBSSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNS KAAKPKDPVE VAQPS EVLRTVQPTQS PYPTQLFRTVHWQ 
PVQAVPEGEAARTS TMQDETLNSSVQS IR\TIQAPTQVPVWSP 
RNG^\ljnVTLQTVPLTTVIASTDPSAGTGSQKFILQAiPSSQP 
MTVIfKENVMLQSQKAGSPPS IVLGPARV\QQVLTSNVQTI CNGT 
VSV\ASSPSFS \ATAPVVTLFLIjGSSQLVAHPPGTVITSVI ktq 
BTKTLTQEVEKKES EDHLKENTE KTEQQ PQPYVMWS S SNGFTS 
QVAMKQNELLEPNSF 


5404 


187


1111 


LPVTLI FAKMKTLQSTI*LLLLLVPLIKPAPPTQQDSRt 1 YDYGT 
DNFE ES I FS QD YE D KYLDG KN I KBKETVI I PNEKSLQLQKDE AI 
TPLPPKKENDEMPTCLLCVCLSGSVYCBEVDIDAVPPLPKESAY 
LYARPNKIKKLT\AXDFADIPNLRRLDFTGNIiIEDIEDGTFSBCL 
SLVEELSLAENQLLKLPVLPPKtiTLFNAKYlTKIKSRGIXANAFK 
KLNNLTFIiYIiDHKALESVPLNLPEStiRVIHLQFNNIASITDDTF 
CKANDTS Y IRDR I EEX R LEGNP I VLGKH PNS FI CLKRLPIGS Y F 


5405 


2199 


1520 


QNS RSLHMDPQNQHGSGSSLWI QQPSLDSRPRLDYEREIQ PTA 
ILSLDQIKAIRGSNEYTEGPSWKRPAPRTAPRQEKHERTHB 1 1 
PliJVNNNYEHRHTSHLGHAVLPSNARGP ILSRSTS TGSAASSGS 
NS SASS EQGLLGKSPPTRPVPGHRSERAIRTQPKQL I VDDLKGS 
LKEDLTQHKFICEQCGKCKCGECTAPRTLPSCLACNRQCLCSAE 
S MVEYGTCMCI*\ VKGI F YKCSNDDEGDS YS DNPCS CSQSHCCSR 
YLCMGAMSLFLPCLLCYPPAKGCLKLCRRCYDWIHRPGCRCKNS 
NTVYCKLES CP SRGQG KPS 


5406 


279 


2732


RWRTYNVEGPLTFMDVAIEFCLEEWQCLDTAQQNLYRNVMLENY
RNLVFLG/ 1 IAVSKPDLITCLEQEKEPWEPMRRHEMVAKPPVMC 
SHFTQDFWPEQHIKDPFQKATLRRYKNCEHKNVHLKKDHKSVDB 
CKVHRGGyNGFNQCLPATQSKIPLFDKCVKAFHKFSNSNRHKIS 
HTEKKLFKCKECGKSFCMLSHLAQHXI IHTRVNFCKCEXCGKAF 
NCPSIITKHKRINTGEKPYTCEBCGKVFNWSSRliTTHKKNYTRY 
KLYKCEECGKAFNKSSILTTHKIIRTGEKFYKCKECAKAFNQSS 
NLTEH KKIHPGE KP YKC EE CGKAFNW P STLTKH KRIHTGE KP YT 

CEECGKAPNQFSNLTTKKRrHTA\EKFYKCTECGEAFSR5\SNL 
TKHKEIHTEKKPYKCRRCG KAPKWQ <5 tfT .TEH yi .TUTrtPVDvvrc 

KCG KAPNCPS 1 1 TKHNR INTGB KPYTCEECG K VFNWSSRliTTH K 

KNYTRYKLYKCBSCGKAFNKSSILTTHKKIHIEKXFYKCEECGK 

APKWSSKLTEHKITHTGEKPYKCEECGKAFNHFSILTKHKRIHT 

GBKPYKCEECGKAFTQSSNLTTHKKlH'itSEKFYKCEECGKAFTO 
SSWIjTTHKKIHTGGKP YKCEEfRJfa BTJfYFQTT/PvtftrT Ttrrpw irn 

YKCEECGKAFIG'JSSTLTKHKIIHTGEJCPYKCEECG\KAFKLSST 
LSTH KI I HTGE KP YKCE KCGKAPNR P SNLI EHKKIHTGEQ P YKC 
EECGKAFNYSSHLNTHKRIHTKEQPYKCKECGKAFNQYSNLTTH 
NKIHTGEKLYKPEDVTVILTTPQTFSMI K 


5407


3 


653 


RPRRRQSSCCTGWIiAGWLLRAAPRFCRRTETDMEGGKGLAVLII, " 
AH LLQGTLAQS IKSNHLVKVYDYQEDGSVLLTCDAEAKNITWF 
KDGKMIGFLTEDXKKWNLGSNAKDPRGMYQCKGSQNKSKPLQVY 
YRMCQNCIELNAATI SGFLPAEI VS I FDLAVGVYF I AGTGMEFR 
QS\PJlSDKQTLLP\NDPAPTQPLKDPRKMTQYSHLQGN\QliRRN 


5408


2745 


6123 


QGSKGTCHPQAQQPWDEGVWQEAPSQSEPWGQSQEPPTMPQRiP 
HARQHTPLPLGSADYRRWSVRPQGPHRDPKDSRDAAKREQGSri 
APRP VPASRGGKTLCKGYRQAPPGPPAQFQRP I CSAS PPWASRP 
STPCPGGAVREDTYPVGTQGVPSLALAQGGPQGSWRFLEWKSMP 
RLPTDLDTGGPWFPHYDFERSCWVRAISQEOQIJVTCWQAEHCGE 
VRNKDMS WPEEMS FIANS S KIDRHKVPTEKGATGLS NLGNTC FM 
NSSIQCVSNTQPLTQYFISGRHLYELNRTUPIGMKGHMAKCYGD 
LVQELWSGTQKNVAPLKLRWTIAKYAPRFWGPQQQDSQELIiAFL 
LDGLHEDLNRVHEKPYVELKDSDGRPDWEVAAEAV?DmHiRRNRS 
X WDLFHGQLRS QVKCKTCGHI SVRFDP FNFLSLPLPMDS YMHL 
ElTV X KLDGTTP VR YGLRLMflDEKYTGL KKQLSDLCGLNSE Q I L 
IAEVHGSNIKNFPQDNQKVRLS VSG PLCAFE IPVPVS PISASS P 
TQTDFS S S PSTNEMFTt»TTNGD3jPR P I F I PNGMPNTW PCGTEX 
NFTNGMVNGHMPSLPDSPFTGYriAVHRKMMRTELYFLSSQKNR 
PS LFGM PIjI VPCTVHTRKKDLYDAVW I QVS RX»AS PL P PQEASNH 
AQDCDDSMG YQYPFTLRWQ KDGN5CAWCP WYRFCRGCKIDCGE 
DRAFIGNAYIAVDWHPTALHIiRYQTSQERVVDEHBSVEQSRRAQ 
VSPINLDSCLRAFTS BEELGENEM YYCS KCKTKCLATKKLDLWR 
LPPILIIHLKRPQFVNGRWIK3QKIVKFPRESFDPSAFLVPRDP 
ALCQHKPLTPQGDELSEPR I LAREVKKVDAQSSAGEEDVLLSKS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKKSSPNSSPRTLGRS 
KGRLR LPQ I GSKNKLS SSKENLDAS KENGAGQ I CELADALS RGH 
VLGGSQPELVTPQDHEVALANGFLYEHEACGNGCGNGYSNGQLG 
NHS BEDSTDDQR EDTR I KPI YNLYAI SCHSGILiGGGHYVT YAKN 
PWCKWYCYNDSSCKELHPDEIDTDSAYI LFYEQQGI DYAQFLP K 
TDGKKMADTSSMDEDFESDY\EKYCVLQ 


5409


2745 


6128 


QGSKGTCHPOAyQPWDEG^QEXpiOSB^PWGQ^QEPPTMPQRLP 
HARQHTPLPLGSADYRRWSVRPQGPHRDPKDSRDAAKREQGSL 
APRPVPASRGGKTLCKGYRQAPPGPPAQFQRPICSASPPWASRF 
STPCPGGAVREDTYPVGTQGVPSLAIiAQGGPQGSWRFLEWKSMP 
RLPTDLDI GGPWFPHYDFERS CWVRAISQEDQLATCWQAEHCGE 
VRNKE)MSWPEEMSFIANSSKIDRHKVPTEKGATGLSNLGNTCFM 
NS S IQCVSNTQPLTQYF I SGRHLYELNRTN P IGMKC-HMAKCYGD 
LVQELWSGTQKNVAPLKLRWT IAKYAPRFNG FQQQDSQELLAFI* 
LDGI^DLNRVHEKPYVELKDSDGRPDWEVAAEAWDNHLRRNRS 
I WDLFKGQLRS QVKCKTCX3H I SVRFDP FN FLSL PL PMDS YMHL 
EI TVI KLDGTFP VRYG LR LNMDEKYTGLKKQLS DLCGLNS EQ I L 
LABVHGSNlKNFPQDNQKVRLSVSGFtCAPEIPVPVSPISASSP
TQTDFSSSPSTNEMFTLTTNGDLPRPIFIPNGMPNTWPCGTEK 
NFTNGMVNGHMPSLPDSPFTGYI I A VHRKMMRTELYFLSS Q KNR 
PSLFGMPLI VPCTVHTRKKDLYDAVW IQVS RIiASPLPPQBASNH 
AQDCDD SMGYQ YPFTLRVVQKDGNS CAWCP WYRFCRGCKIDCGE 
DRAFIGNAYIAVDWHPTAliHLRYQTSQBRWtEHESVEQSRRAQ 
VEP I NLDS CLRAPTS EEE LGENEM YY CSKC KTHCIJVTKKLDIiWR 
LPPILIIHLKRFQFVNGRWIKSQKIVKPPRBSFDPSAPLVPRDP 
ALCQHKPLTPQGDELSEPRI LARBVKKVDAQSS AGEEDVLLSKS 
PSSLSANI ISSPKGSPSSSRKSGTSCPSSKNSS PNSSPRTLGRS 
KGRLRLPQIGSKNKLSSS KENLDASKENGAGQI CELADALSRGH 
VLGGSQPELVTPQDHEVALANGPLYEHEACGNGCGNGYSNGQLG 
NHSBEDSTDDQREDTRIKPIYWLYAISCHSGILGGGHYVTYAKN 
PNCKWYCYKDS SCKELHPDEIDTDSAYI LFYEQQGI DYAQFLPK 
TDG KKMADTS S MDEDFES DY \EKYCVLQ 


5410 


2 


710 


IiRPPGQARHVWIAARMQAPHKEHL YKLI* VIGDLG VGKTS 1 1 KRY 
VHQNFSSHYRATIGVDFALKVLHWDPETWRLOLTOIAGQERFG 
NMTRVYYREAMGAFIVFDVTRPATFEAVAKWKNCLDSKLSLPN3 
KPVSWLIoANKCDQGKDVLMNNGIaKMDQFCKEHGFVGWFETSAK 
ENINI DEASRCLVKHI LANECDIJ4ES 1 E PDWKPKT k T<?TX va <;r* 
SG\CAKI LVGTFAGVW 


5411 


1302 


289 


TGPAAAGRRKALGSFGKPS PVTGLRAARRRRTRPS APAAPS VGC 
GKRRBSDAGAGGERASVRTGSGRRGGRTMAGDSEQTLQNHQQPN 
GGEP FL IGVSGGTASGKS S VCAK I VQ LLGQNEVDYRQKQ WILS 
QDSPYRYLTSEQKAKALKGQFNFDHPDAFDNELILKTLKEITEG 
KTVQIPVYDFVSHSRKEETVTVYPADWLFEGIliAFYSOBR/lR 
DLFQMKLFVDTDADTRliSRRVLKDISERGRDLEOILSSSTLRFV 
KPA\FEEFCLPPK\KYADVIIPR\GADN\RVPINLrVQHIQ\DI 
LNGGPS\NRQTNGCLNGYTPSRKRQASESSSRPH


5412

3180 


313 


QGISNFFHKEANFWFEVSGYLISPLRSPFVDPALBWSLMASPWN
KMEGESSRFEIHTPVSDKKKKKCS IHXERPQKHSHE I FRDSSLV 
NEQSQITRRKKRKKDFQHLISSPriKKSRICDETAKATSTLKKRK 
KRRYSALEVDEEAGVTVVLYDKENINNTPKHFRKDVDWCVDMS 
IEQKLPRK\PKTDKFQVLAKSH\AHKSEALHSKVREKKNKKHQR 
KAASWESQRA\RI>TIiPQSEFPTQEESWI/SVGPGGEITBt»P\A5A 
HKNKS KKKKKKS SNRE YET \ LAMPEGS QAGRE AGTDMQESQPT7 
GI^DETPQLWPTHKKKSKKKKKKKSNHQEFESLAMPEGSQVGS 
EVGADMQES \RPAVGLHGETAG I PAPAYKNKSKKKKKKSNHQEF 
EAVAMPESLESAYPEGSQVGSEVGTVEGSTALKGFKBSNSTKKK 
SKKRKLTSVJCRWIVSGDDPSVPSKNSBSTLFDSVEGDGAMMBEG 
VKSRPRQ KKTQACLAS KHVQEAPRLE PANEEHNVETAEDS B I RY 
LSADSGDADDSDADLGSAVKQLQEFI PNIKDRATSTI KRMYRDD 
LERFKEFKAQGVAI KFGKFS VKEWFCQt»EKNVEDFLAliTGIESAD 
KLLYTDR YPE BKS VI TNLKRR YSFRLH IG \ RNXARPWXLI YYRA 
KKMFDVNNYKGRYS EGDTEKLKMYHSLLGNDWKTI GEMVARRSL 
SVALKFSQlSSQRNRGAWSKSETRKLIKAVEEViriKKMSPQBLK 
BVDSKLQENPESCLS IVREKLYK3 ISWVEVEAKVQTRNWMQCKS 
KWTEILTKRMTNGRR I YYGMNALRAKVSLI ERLYE INVEDTNEI 
D WEDLAS AI GDVPPS YVQT KFSRLKAVYVP FWQKKTFPB I IDYL 
YETTLPLLKE KLE KMMEKKG TKI QTPAAP KQ VFPFRD I FYYBDD 
S EGGGHRKRKRRPRRHAWFTPVI PVLWEAKAGWI I


5413

3753 


1304 


RPPAGVAPRRAMANVSKKVSWSGRDRDDEEAAPLLRRTARPGGG
T?I»LNGAGPGAARQSPRSALFRVGHMSSVKliDDELIiEP \DMDPP 
HPF P KE I PHNEKLLSLKYESLDYDNSENQL FLEEERRI NHTAFR 
TVEIKRWVICAL1GILTGLVACFIDIVVE11LAGLKYRVIRGNID 
KFTEKGGLSFS LLIiWATLNAAF VLVG3VI VAF I EPVAAGSG I PQ 
IKCFLNGVKIPHWRLKTLVI KVSGVTLSWGGLAVGKEGPMIH 
SGSVIAAG ISQGRSTSLKRDPKI FEYLRRDTBKRDFVSAGAAAG 
VSAAFGAPVGGVLFSLEEGASFWNQFLTMRIFFASMISTFTLNF 
VL3 1 YHGNMWDLSS PGIiINFGRFDSBKMAYTlHEIPVFIAMGW 
GGVLGAVFNALN YWLTMFRIR YIHRP CLQVI EAVLVAAVTATVA 
FVLIYSSRDC^Pl^GGSMSYPLQLFCADGEYNSMAAAFFNTPEK
S WS LPHDPPGS YNPLTLGLFTLVYFPIiACWT VGLT VSAG VFIP 
SLLZ GAAWGRL ?GIS LS Y LTGAAI WADPGKYALMGAAAQLGG I V 
RMTLSI/TVIMMEATSNVTYGFP IMliVLMTAKI VGDVFIEGLYDM 
HIQLOSVPFLHWEAPVTSHSLTARHVMSTPVTCLRRRBKVGVIV 
DVLSDTASNHNGFPVVEHADDTQPARtOGLILRSQLIVLtiKHICV
FVERSNI/3LVQRRLRLKDFRDAYPRFPP IQS IHVSQDEREC1M) 
LS BFMNPSP YT VPQE AS LP RVPKLFRALGLRHLWVDNRNQ WG 
LVTRXDIiARYRLGKRGLEELSIAQr


5414


2130 


390 


GVASAWDRAL FS PLLSPTSRVFRTS PPRC VSTETGRRDRARVPS 
QWC3VLQGKLPVSGRTSIACVRSILLSPASSPRKVGIVGGTGAR 
AGAAPRDHGRVRHRRPSSARRMTRTTGQCLAPRGCQGPRGTRSP 
RS PRSRTRRGCSASPACLP /CRSALI VAVLCYINT iNYMDRFTV 
AGVLPDIEQFFNIGDSSSGLIQTVFISSYMVLAPVFGYLGDRYK 
RKYLMCGGIAFWSLVTLGSSFIPGEHPMLLLLTRGLVGVGEASY 
STIAPTLIADLFVADQRSRMLS I FY7AI PVGSGLGYIAGSKVKD 
MAGDWHWALR VT PGLG WAVLLLFL WREPPRGAVERKSDL P P L 
NPTS W WADLRALARNPS FVIiSSLG FTAVAFVTGSLAIiW APAFLL 
RSRWLGETPPCLPGDS CSSSDSLI FGLITCLTGVLGVGLGVBI 
SRRLRHSNPRADPLVCATGI»LGSAPFLFLSLACARGSIVATY I F 
IFIGETLXjSMNWAIVAD I LLYWI PTRHSTAEAFQIVX.SHLLGD 
AGS P YL XGIiI S DRIJiRNWPPS FLSE FRALQFSLfCiCAFVGALGG 
AAFLGTAHLH 


5415 


693 


298* 


IPPKTKLELQKH\LTXLT\NQEQATIFEEVQKLRPRNEQRENEL 
IISFLRCLFBEKQKEHIHIGEMKQTSQMAAENIGSELPPSATRF 
RLDMLXNKAKRSkTESLES I LSRGNKARGLQKHS I SVDLDSSLS 
STLSNTSKEPSVCEKEALPISESSFKLLGSSEDLSSDSESHLPE 
EPAPLSPQQAFRRRANTLSHFPIECQEPPQPARGSPGVSQR1CLM 
RYHSVSTErPHERKDFRSKANHLGDSGGTPVKTRRHSWRQQIFL 
RVAT PQKACDS SS R YEDYS ELGELP PRS PLEPVCEDGPFG P PPE 
EKKRTSRELRELWQKAILQQILLLRMEKENQKLQASENDLLNKR 
LKLDYEE I TPCLKBVTTVWEKMI»STPGRSKIKFDMEKMHSAVGQ 
GVP\RHHRGEXWKFLAEQFHLKHQFPSKQQPKDVPYKELIjKQLT 
SQQHAILIDLGRTFPTHPYFSAQLGAGQLSLYNILKAYSLLDQ3 
VGYCC»LSFVAGILL1«MSEEEAF1QMLKFLMFDMG1*RKQYRPDM 
IILQIQMYQLSRLLHDYHRDLYNHLEEHSIGPSLYAAPWFLTMF 
ASQFPLGFVARVFDMIFLQGTEVIFKVALSLLGSHKPLIXjQHEN 
LSTI VDFIKS TL PNLGLVQMEKTINQ VFEMDIAXQLQAYE VB YH 
VLQEELIDSSPLSDNQRMDKLEKTMSSLRKQNLDLLEQLQVANG 
RI QSLBAT IEKLLSS E SKLKQAMLTLELERSALIiQT VEELRRRS 
AKPSDREPECTQPEPTGD 


5416 


27 


4074 


KSQLFCFWGGKAGDILSGDQDKEQKDPYFVETPYGYQLDLDFLK
YVDDIQKGNT IKRLNIQKRRKP S VPCPEP RTTSGQQG IWTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPIS3CPPPPLETSLPFLTIP 
ENRQLPPP S PQLPKKNLHVTKTLMETRRRLEQERATMQMTPGEF 
RRPRLASFGGMGTa^SLPSFVGSGNHNPAKHQLONGYQGNGDYG 
SYAPAAPTTSSMGSS IRHSPLSSGlSTPVTNVSPMHliQHIREQM 
AIAItKRLKELEEQVRTIPVLQVKI S VLQEEKRQLVSQLKNQRAA 
SQINVCGVRKRSYSAGNASOLEQLSRARRSGGBLYIDYEEEEME 
TVEQSTQRTKEFRQL\TADMQALEQKIQDSSCEASSELRENGEC 
RS VAVGAE ENMITO I VVYHRGSRS CKDAAVGTLVEMRNCGVSVTE 
AMLGVMTEADKEISLQQQTIESLKEKIYRLEVQLRETTHDREMT 
KLKQELQAAGSRKKVDKATMAQPLVFSKVVEAVVQTRDQMVGSH 
MDLVDTCVGTS VE TNS VGISCQ PECKNKVVGPELPMNW WI VKER 
VBMHDRCAGRS VEMCDKS VSVBVSVCETGSNTEES VWDLTLLKT 
NIMiKEVRSIGCGIxrsVDVTV'CSPKEC^RGWTEAVSQVEAAV 
MAVPRTADQDTSTDIiEQVHQFTNTETATL I ES CTNTCLSTLDKQ 
TS TQTVETRTVAVGEGRVKD IMS STKTRS IG VGTLLSGH5GFDR 
PSAVKTKESGVGQIN1NDNYLVGLKMRTIACGPPQLTVGLTASR 
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIER1QKLLAEQQ 
TL1AENYSELAEAFGEPHSQMGSLMSQLISTLSSINS7MKSAST 
BELRNPDFQKTSLGKITGSYLGYTCKOGGLQSGSPLSSQTSQPB
QEVGTSEG KP ISS LDA FPTQ EGTLS P VNiTODQIAAGL YACTNN 
EST^KSIMKKKDGNKDSNGAKKMLQFVGINGGYETTSSDDSSSD 
ESSSSESDDECDVIEYPLEBEEEBEDBDTRGMAEGHHAVNIEGL 
KSARVEDEMQ VQE CE PEKVE I RERYELS B KMLS ACNL L KNT I ND 
PKALTSKDMRFCLNTLQHEWFRVSSC2KSAIPAMVGDYIAAPEAI 
SPDVLRYVINLADGNGNTALHYSVSHSNPEXVKLLI»DADVCNVD 
HQNKAGYTPIMIiAALAAVEAEKDMRIVBELFGCGDVNAKASQAG 
QTALMIAVSHGR I DMVJCGLLACGAD VNI QDDEGSTALMCASEHG 
HVE IVKLLLAQPGCNGHLEDNDGSTALS I ALEAGHKDIAVTjLYA 
HVNFAKAQSPGTPRLGR KTS PGPTHRGS FD 


5417 


27 


4074 


XSQIiPCPWGGXAGDILSGDQDKEQKDPYFVETPYGYQIiDLDFLK 
YVDDIQKGNTIKRLNIQKRRKPSVPCPEPRTTSGQQGIWTSTES 
I.SSSNSDDNKQCPNFI*IARSQVTSTPI S KPPP PLETSLPFLT I P 
ENRQLPPPSPQLPKHNLHVTKTLI^ETRRRLEQERATMQMTPGBF 
RRPRLASFGGMGTTS S L PSF VGSGNHNPAXHQLQNG YQGNGDYG 
SYAPAAPTTSSMGSS I RHSP LS SGI STPVTNVSPMKLQHI REQM 
AIALKRLKELEEQVRTI PVLQVKIS VLQEEKRQLVS QLKNQRAA 
SQINVCGVRKR3YSAGNASQLEQLSRARRSGGELYIDYEEEEME 
TVEQ3TQRIKEFRQL\TADMQALEQKIQDSSCEASSEIiRENGEC 
RSVAVGAEENMNDI VVYHRGSRSCKDAAVGTLVEMRNCG VS VTE 
AP0LGVmEAI)KEIELQQQTIBSLKEKIYia,BVOr^ETTHDREMT 
KLKQ E LOAAG S RKKVD KATMAQ P LVFSKVVE AWQTRD QMVGS H 
MDLVIm^GTSVETWSVGISCQPECK^ncWGPELP^1^^7WrVKER 
VEMHBRCAGRSVEMCDKSVSVEVSVCETGSNTEBSVNDLTLLKT 
NLNLKEVRS IGCGDCSVDVTVCS PKECAS RGVNTEAVSQVEAAV 
MAVPR TADQDTS TDL EQVHQF TNTETATL I HS CTNTCL S ThDKQ 
TSTQTVETRTVAVGEGRVKDINS STKTRS IGVGTTiLSGHSGFDR 
PSAVXTKBSGVGQININDNYLVGLKMRTIACGPPQJ.TVGLTASR 
RS VG VGDDP VGES LENPQPQAPLGMMTGLDHYIERIQKLLAEQQ 
TLLAENYSELAEAFGEPHSQMGSLNSQLIST1.SSINSVMKSAST 
EELRNPDFQKTSLGKITGSVLGYTCKCGGLQSGSPLSSQTSQPE 
QBVGTSEGKP ISS LDAPPTQEGTLSPVNLTDDQIAAGL YACTNN 
ESTLKSIMKKKDGNKDSNGAKKNLQFVGINGGYETTSSDDSSSD 
ESS SS ESDDBCDVIEYPLEEEEEEEDBDTRGMAEGHHAVNI BGli 
KSARVEDEMQVQHCEPHKVSIRERYEIiSEKMIiSACNLLKNTIND 
P KALTS KDMRFC1NTLQHE WFRVSSQKSA I PAMVGDY IAAFEAI 
S PDVL, RYV I NLADGNGNTALHYS VS H SNFE I VKLLLDAD VC!NVD 
HQNKAGYTPIMLAAIiAAVEAEKD^IVEELFGCGDVNAKASQAG 
OTALMLAVSHGRIIXvIVKGlJACGADVNIQDDEGSTALMCASEHG 
HVBIViOjIJ^QPGCNGHLEDNDGSTALSIALEAGHKDIAVLLYA 
HVNFAXAQS PGTPRLGRKTS PGPTHRGSFD 


5418 


24 


1133 


S VPRAGGDMHTGAAELY DQ ALLG I LQH VGNVQDFIi RVLFGFLYR 
KTDFYRLLRHPSDRMGFPPGAAQALVLQVFKTFDHMARQDDEKR 
RQEIiEEKIRRKEEBEAKTVSAAAAEKEPVPVPVQErEIDSTTEL 
DGHQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGAAEVPR\EP?I 
LPRIQEQFQKNPDSYNGAVRENYTWSQDYTDLEVRVPVPKHWK 
GKQVSVALSSSS I R VAML EENGERVLMSGKLTHKINTES SLWS L 
EPGKCVLVNLS KVGEYWWNAILEGEEPIDIDKINKERSMATVDE 
EEQAVLDRLTFDYHQKLQGKPG^HELKVH^KKGWDASGSPFR 
GQRFDPAMFNISPGAVQF 


5419 


1395 


259 


GTHPI^PDLVSRTSVQGPl^TMACPGMSD^EESPFLGPRAAEEG^ 

SESEACBAFGRRKSEEEGRRSDTSGFGRERKHKVNWKHPERADA 

iGDPASLPQC/LGP/DCVRPAQPSSKYCSDDGGMKLAANRIYEIIi 

PQRIQQWQQSPCIABEHGKKLLBRIRREO^SARTRLQEMERRFH 

EtiEAIILRAKQQAVREDBESNEGDSDDTDLQIFCVSCGHPINPR 

VALRKMERC^AKYESQTSFGSMYPTRIEGATRLFCDVYNPQSKT 

YClOiLQVLCPEHSRDPKVPADE VCGC PLVRDVFELTGDFCRL PK 

RQCNRHYCWEKLRRAEVBLERVRVWY KLDELF EQERNVRTAMTN 

RAGXrALMLHQTIQHDPLTTDLRSSADR 


5420 


117 


1733 


NEAGOACPFKGGASGRLYI*SPRLPRVSVAGCEERPLGWVWVLGG 
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQLFHERIR
KCIISlliLFATI*yiLCHIP^RFKKPAEPTT\GM>IKMPPSTRX7~ 
LLELCTFTLAIALGAVLLLPPSI ISN3VLLSLPRNYYIQWLKGS 
LIKGLWNLVPLFSNJ^SLIFLMPPAYFPTBSBOFAGSRKGVLGRV 
YBTWMLMLLTLLVLQMVWVASA I VDKNXANR E5LYDF WE Y YLP 
YLYSCISFI^VLLLLVCTPLGLARMFSVTGKLIjVKPRIjLEDLEE 
QLYCSAFEEAALTRRICNPTSCWLPLDMELLHRQVtiALQTQRVL 
LEKRRKASAWQRNLGYPIiAMLCLLVLTGLSVLIVAIHILELLID 
EAAMPRGMQGTSLGQVS FSXIX3SFGAVIQWLIFYLMVS S WGF 
YSSPLFRSLRPRWHDTAMTQIIGNCVCLLVLSSALPVFSRTLGL 

TRFDLLGDFGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV 
RABI»IRAFGERE 


5421 


117 


1733 


NEAGGACPFKGGASGRL YLS PRLPRVS VAGCEER? LGWVW VLGG 
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVRBQLFHERIR 
ECI I STLLFATLYI LCHI FLTRFKKPAEFTT\GMMKMPPSTRL/ 
LLELCTFTLAIALGAVLLLPFS IISNEVLLSLPRNYYIQWLNGS 
LIHGLNNLVFLFSNLSLI FLMPFAYFFTESEG FAG SRKG VLGR V 
YETVVMLMLLTLLVT^MVWASAIVDKNKANRESIiYDFWEYYLP 
YLYS C I SFLG VLHiLVCTPLGLARM FS VTGKLLVKPRLLEDLEB 
QL YCSAFEEAALTRR ICN PTSCWL P IiDMELIjHIIQ VLALQTQR VL 
liEKRRJCASAWQRNLGYPLAMLCLLVLTGIjS VLI VA IHI LELL I D 
EAAM PRGMQGTS LGQVS FS KLGS FGAVI QWLI F YLMVSSWGF 
YSS PLFRSLRPRWHDTAMTQI IGNCVCLLVLSSALP VFSRTLGL 
TRFDLLGDFGR FN W LGN P Y I VFL YNAAFAGLTT LC LVKT FTAAV 
RAEIilRAFGERE 


5422 


3 


1263 


SCGESLPTWLAGASRPG I GRKGGAWGGRGGSS PAQ VLLS PGPVF 
KAGCNWVfHLSRDQAGVQRCDLGSSQPPPLGFKRFSCLSLPSSND 
YRSTVLCVSKMEADIiSGFNIDAPRWDQRTFLGRVKHFLNITDPR 
TVFVSERELDWAKVMVEKSRMGVVPPGTQVEQI.T.YAKKLYDSAF 
HPDTGEKMNV3GRMSFQLPGGMIITGFMLQFYRTMPAVIFWQWV 
NQSFNALVNYTNRNAASPTS VRQMALS Y FTATTTAVATAVGMNM 
LTKKAP P LVGRW VP FAAVAAANCVNI PMMRQQELI KGICVKDRN 
BNEIGHSRRAAAIGITQVVISRITMSAPGMILLPVrMERLEKIiK 
FMQKVKVL/SAPLQVMIiSGCFLIFMVPVACGLFPQKCZBLPVSYL 
EPKLQDTI KAKYGELEPYVYFNKQL 


5423 


3186 


905 


GVSMALGEEKAEAEASEDTKAQS YGRGS CRERELOI PGP'MSGEQ 
PPRXEAEGGLISPVWGAEG I PAPTCWIGTDPGGPSRAHQPQASD 
ANREP VAERSEPALSGIiPPATMGSGDLLLSGES QVEKTKLSS SE 
E FPQTLS LPRTTI CSGHDADTEDDPSLADLPQALDLSQQPHSSG 
LSCLSQWKS VLS PGSAAQPSSCS ISASSTGSSLQGHQBRAEPRG 
GSLAKVSSSLEPWPQBPSSWGriGPRPQWSPQPVFSGGDASGIi 
GRRRLSFQAEYWACVLPDSLPPS PDRHS PLWNPNKE YEDLLDYT 
YPLRPGPQIiPKKLDSRVPADPVLODSGVDLDSFSVSPASTLKSP 
Tm^SPNCPPAEATALPFSGPREPSLKQWPSRVPQKQGGMGLASW 
SQLASTPRAPGSRnARPmRRBPAIJlGAKDRLriGKHLDMGSPQL 
RTRDRGWPSPRPEREKRTSQSARRPTCTESRKKSBEEVESDDEY 
1ALPAPXTQVSSLVS YLGS I STLVTL PTGDI KGQS PLEVSDS DG 
PASFPS3SSQSQLPPGAALQGSGDPEGQNPCFLR5FVRAHDSAG 
EGSU3SSQALGVSSGLLKTRPSLPARLDRWPFSDPDVEGQLPRK 
GGEQGKESL VOC \ VKTFC\ CQLEEL ICWLYNV\ AD VTDHGTPAR 
SNLTS LK \ SS I^JLYRQFKKD IDEHQSLTES VLQKG E ILLQCLLE 
NTPV^EDVI/JRIAKQSGE^ESHADRLYDSiriASLDMLAGCTLI P 
DKKPMAAMEHPCEGV 


5424 


3186 


505 


GVSMALGEEKAEAEASEDTKAQSYGRGSCRERELDIPGPMSGEQ
PPRl*EAEGGLISPW7GAEGIPAPTCWIGTDPGGPSRAHQPQASD 
ANREPVAER5 EPALSGLPPATMG SGDLLLSGESQVEKTKLSS S E 
EFPQTLSLPRTTICSGHDAimiDDPSLAJDLPQALDLSQQPHSSG 
LSCLSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERABPRG 
GSIJ^SSSLEPWpQEPSSWGI^PRPQWSPQPVFSGGI)ASGL 
GRRRLS FQAE YWACVL PDSLP PS PDRHS PLWNPNKEYEDLLDYT 
YPLRPGPQLPKHLDSRVPADPVLQDSGVDLDS FSVS PASTLKSP 
TNVSPNCPPAEATALPFSGPRBPSLKQWPSRVPQKQGGMGLASW
SQZJVSTPRAPGSRDARWERREPALRGAKDRLTIGKHLDMGSPQL 
RTRDRGW PSPR PEREKRTSQS AR RPTCTESRWKSEE EVES DDEY 
LAL PARLTQVS S LVS YLGSIS TLVTLPTGDI KGQS PLEVSDSDG 
PASPPSSSSQSQLPPGAALQGSGDPSGQNPCFIjRSFVRAHDSAG 
EGS LG S SQALG VSSG LLKTRP £ L PARLDRW P FSDPDVEGQIiPRK 
GGEQGKBSLVQC\VKTFC\CQLEELICWLYNV\ADVTDHGTPAR 
SNLTSLKXSSLQLYRQFKXDIDEHQSLTESVLQKDBILLQCLLE 
NTPVLEDVLGR IAKQSGELESHADRIiYDS ILASLDMLAGCTLI P 
DKKPMAAMBHPCEGV 


5425


1086 


115 


GFCPSPSLGHQPPRVLHPTMSMAVETFGFFMATVGliLMLGVTLP 
NS YW RVST VHGNVI TTNT I FENLWPSCATDSLGVYNCWBFPSML 
ALSGYIQACRALMITAILLGFLGLIiLGIAGLRCTNIGGLELSRK 
AKLAATAGAPH \ ILPGICGMVAI \SN YAKNITR\DFSDPLYPGT 
KYELGPALYLGWSASLISILGGLCLCSACCCGSDEDPAASARRP 
VQAPVSVMPVATSDQEGDSSFGlCYGRNALRVAAtiCRGPRCLPTA 
PKKRGPGRGPFPYSNLRGRPRPVPVAPPRPRPRVUISIIGPSQAK 
NCS WEVAYLPSEAGSLI F 


5426 


42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP"* 
PAAHAKPDPGSGGQPAGPGAAGEALAVLTS FGRRLLVLI PVYLA 
GAVGLSVGFVLFGLALYLGWRRVRDEKEKSLRAARQLLDDEEQL 
TAKTLYMSHRELPAWVSFPDVEKAEWLNKIVAQVWPFLGQYMEK 
LLAETVAPAVRGSNPHLQT FTFTRVELGBKPLR I IGVKVHPGQR 
KEQ ILLDLNIS YVGDVQI DVBVKKYFCKAGVKGMQLHGVLRVTL 
EPLIGDLPFVGAVSMFFIRRPTLDINWTGMTNLLDIPGLSSLSD 
TMIMDS I AAFLVLPNRLLVPL VPDLQDVAQLRS PLPRG I 1 R 1HL 
LAARGLSSKDKYVKGL IEGKSDPYALVRLGTQTFCSRV tDEELN 
PQWGETYEVMVHEVPGQEIEVEVFDKDPDKDDFLGRMKLDVGKV 
LQASVLDDWPPI,QGGQGQVHLRLEWI*SLLSDAEKLEQVLQWNWG 
VS SRPDPPSAAILWYLDRAQDLPM VTS EL YP PQL KKGNKE PNP 
MVQLS I QDVTQES KAVYSTNCPVWE EAFRFFLQDPQSQBItDVQV 
KDDSRALTLGALTLPLARLLTA PEL ILDQWFQLS S SGPNS RLYM 
KLVMRI L YLDSSE I CFPTVPG CPGAWDVDSBNPQRGSfi VDAPPR 
PCHTTPDSQPGTEHVLRIHVLEAQDLIAKDRFLGGLVKGKSDPY 
VKLKLAGRSFRSHWREDLNP RWNBVFEV1 VTS VPGQELE VEVF 
DKDLDKDDFLGRCKVRLTTVLNSGFLDEWLTLEDVPSGRLHLRL 
ERLTPRPTAAEIJSBVLQVNSLIO/rQKSAELAAALLSTYMERAED 
LPLRKGTKHLSP YATLTVGDS SHKTK7T SQTS AP VWDES ASFLI 
RKPKTESLBLQVRGEGTGVLGSZ>SLPLSELLVADQLCLDRWFTL 
SSGQGQVLLRAQLG ILVSQHSG VEAHSHS YSHS S SSLSEEPELS 
GGPPHITSSAPEV\RQRLTHVDSPLBAPAGPLGQVXLTLWYYSE 
ERKLVS X VHGCRSLRQNGRDP PDPYVSLLLLPDKNRGTKRRTS 0 
KKRTLS PEFNERFEWELPLDEAQRRKLDVSVKSNSS FMSREREL 
LGKVQLDLAETDLSQGVARWYDr^lDNKDKGSS 


5427 


42 


3435


ATSSQSLGRADPPRCWTMBRSPGEGPSPSPMDQPSAPSDPTDQP"" 

PAAHAKPDPGSGGQPAGPGAAGEALAVLTSFGRRLLVLIPVYLA 

GAVGLSVGFVLFGLALYLGWRRVRDEKERSLRAARQLLDDEEQL 

TAKTLYMSHRELPAWVSFPDVBKAEWLNKIVAQVWPFLGQYMEX 

LLAETVAPAVRGSNPHLQTFTFTRVELGEKPLRIIGVKVHPGQR 

KEQI L LD LNI S YVGDVQID VEVKKY FCKAGVKGM QLHGVLR VIL 

EPLIGDLPFVGAVSMFFIRRPTLDINWTGMTNLLDIPGLSSLSD 

TMIMDS IAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGI IR IHL 

LAARGLSSKDKYVKGLIEGKSDPYALVRLGTQTFCSRVIDEELN 

PQWGBTYEVMVHBVPGQBIBVEVFDKDPDKDDFLGRMKLDVGKV 

LQAi^LDDWFPLO^GQGQVHLRLEWLSLLSDAEKL EQVLQWNWG 

VSSRPDPPSAAILVVYIiDRAQDLP^4VTSELYPP0LKKGNKEPNP 

MVQLS IQDVTQES KAVYSTNCPVWEEAFRFFLQDPQSQELDVQV 

KBDSRALTLGALTLPLARLLTAPBL ILDQWFQLS SSG PNSRLYM 

KLVMR IL YLDSS B I CFPTVPGCPG AWDVDSENPQRGS S VDAP PR 

PCMTTPDSQFGTBHVLRIHVLEAQDLIAKDRFLGGLVKGKSDPY 

VXLKLAGR3FRSHVVREDLNPRWNEWEVIVTSVPGQELEVEVF 

D KD LJDKDDFLGR C KVRLTTVLNSGFLDEW LTLED VPS G RLHLRL 


5428






KRLTPRPTAAEI^EVLQVNSLIQTQ^AEIAAALLSIYMERAED 
LPLRKGTKHLS PYATLTVGDSSRKTKTI SQTSAPVWDES ASFIi I 
RKPHTESLELQVRGEGTGVLGSLSLPLSELIiVADQLCLDRWFTL 
SSGQGQVLLRAQLGILVSQHSGVEAHSHSYSHSSSSLSEEPELS 
GGPPHITSSAPEV\RQRI*THVDSPLEAPAGPLGQVKLTLWYYSE 
ERKLVSIVHGCRSLRQNGRDPPDPYVSLLLLPDKNRGTKRRTSQ 
KKRTLSP EFNER FEWEL P LDEAQRRKLDVSVKSNSS FMSREREL 
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS 




3 


1839 


SSRSERLSACAIAPPWJLVSSRPARPASLQfePdKMVEDGAEBDED ' 
LVHFSVSELPSRGYGVMEE1RRQGKLCDVTLKIGDHKFSAHRIV 
LAASIPYPHAMFTraMMECKQDEIVMQGMDPSALEALINFAYNG 
N^A I DQ QNVQS h LMGAS FLQLQS I KDACCTF I, R ERLHPKSTCLGV 
RQ FAETMMCAVL YDA ANSFI HQH FVE VSMS EE FLAL PLE DVLEL 
VSRDELNVKSEEQVPEAAIAWVRYDRPrjRriT'PT \ dht aqmtt)t t 

PCRPQPLSDRVQQDDLVRCCHKCRDLVDEAKDYLLMPERRPHLP 
AFRTRPRCCTS IAGL I YAVGGLNS AGDSLNWE VFDP I AUCWEJR 
CRPMTTARSRVGVAWKGLLYAl GGYDGQLRLS TVQAYNTETDT 
WTRVGSMNSKRSAMGTVVLDGQIYVCGGYDGNSSLSSVETYSPE 
TDKWT WTSMSSNRS AA\ G VTVFEGRI YVS GGHDGLQ I FS S VEH 
YKHHTATWHPAAGMLNKRCRHGAASbGSKMWCGGYDGSGFLSl 
AEM YS S V\ ADQWCLIVPM\ HTRR \ SRVSLGGPAVGRLYAVWG VT 
TCQSNL\SSVGDVLTPETDO^TFM\APMACHEGGVGVGCIPLLT 


5429 
5430 


828 


202 


RREDALSSEGCLWPSE3TVSGNGIPEPQVYAPPRPTDRLAVPPF 
AQRERFHRFQP7Y P YLQHEI DLP PTISIjSDGBE PPP YQGP CTLQ 
LRDPEQQLELNRESVRAPPNRT1FDSDLMDSARLGGPCPPSSNS 
GISATCYGSGGRMEG P P P \T YS E VIGHY PGSS FQHQQSSG PPS L 
LEGTRLHHTHlAPliESAAIWSKEKDKQXGHPL 


5431


441 


1507 


UKRRKRRRKKIMKTI QPKMHNS XSWAI FTG3UAALCliFQGVPVRS 
GDATFPKAMDNVTVRQGESATIJiCT IDNRVTRVAWLNRSTILYA 
GNDKWCLDPRWLL SNTQTQ YS IEIQNVDVYDEG P YTCS VQ^DN 
HPKTSRVHLIVQVSPKIVEISSDISINEGNNISLTCIATGRPEP 
TVTWRH IS PKAVG PVSEDB YLE 1 QG ITRBQSG DYE CS ASNDV\ A 

APV\VRKVKVTVNYPPYISEAKGTGVPVGQKGTT J QCEASAVPSA 
EFQW YKDDKRL I / EGKKG VKVENRPFLS KLI FFNVS EHD YGN YT 
CVASNKLGHTNAS IMLFG PGAVS EVSNGTSRRAGCVWLLPLLVL 
HLLLKF 




2 


1312 


AAAAPGSRRRRPLPDRPHMAHGYEAPPPPAPRSPAWRARSKPV\ 
LPGITINP\TIAEGPSP\TSEGASEANLVDLQlOCLEELELDEQQ 
KKRIiEAFXiTQKAKVG ELKDDD FER I SELGAGNGG WTKVQHR PS 

GLIMARKLIHIjE ikpai rnqi irelovlhecnsp yi vgfygafy 

SDGEISICMEHKDGGSLDQVLKEAKRIPEEILGKVSIAVLRGLA 
YLREKHQrMHRDVKPSNILVNSRGEIKLCDFGVSGQLIDSMANS 
FVGTRS YMAPERLQGTHYSVQSI)I WSKGLSLVELAVGRY PXPPP 
DAKELE A I FGRPWDGEEGE PHS I S PRPRP PGRP VSGHGMDSRP 
AI4AIFELLDYIVNBPPPKLPNGVFTPDFQEFVNKCLIKNPAERA 
DLKMLTNHTFrKRSEVEEVDFAGWLCKTLRLNQPGTPTRTAV 


5432 
5433 


2 


1312 



AAAAPGSRilRXPIjPDRPHMAHGV l EAPPPPAPRSPAWRARS^V , \ 
LPG I T1NP\ T I AEG PS P \ TSBGAS EANI»VDLQKKLEBLEIjDEQQ 
K KRL EAFLTQKAK VGE L KDDD FER IS ELG AGNGGWTKVQ HHP S 
GLIMARKLlHLEIKPAIRNQIIRELQVriHECNSPYIVGFYGAFY 
SDGEISICMEHMDGGSLDQVLKEAKRI PEEILGKVS 1AVLRGLA 
YLREKHQlMHRDVKPSWIIjVNSRGEIKLCDFGySGQLIDSMANS 
FVGTRS YMAPERLQGTH YS VQS 11 1 MS MGLSLVELAVGRYP I PPp 
DAKELBAI FGRPWDGEEGEPHS I S PRPRP PGRPVSGHGMDSRP 
WIAlFEXIiDYrVNEPPPKLjPNGVFTPDFQEFVKKCLlKNPAERA 
^yMLTNHTFIXRSB^EVDFAGWIiCKTlfRLNQPGTPTRTAV 




!. 


1885 

3 
I 


SVQEDKVGFEtJPliHLCSWRARACPCTWPHC/CTGLLECIiCaFAGV 
jFGWPSLVFVFKNBDYFKDLOGPDAGPIGNATGQADCKAQDERF 
5L I FTLGS FMNNFMTFPTGYI FDRFKTTVARLI A I FF YTTATLI 








IAFTSAG3AVI^FLAMP^TIGGII*FIiITN1iQIGNLFGQHRSTT~ 
ITLYNGAFDS SSAVFLI IKIiLYEKGISLR/ VI*LHLHLCLQ YLAC 
SrHFPpnAPGAHPIPTAPQLQLWPVPWEWHHKGREG/QQLSMKT 
GS YSQRSS FQRRKR PQGQGRSRNS APSGATL/ CSRR PAWHLVWL 

a »j.uunnii*c AVylAj^oljJjlJNWA^OJJPlAKVSTyTNAF AFTQFGVL 

OU?FWGIJ^DRI^KYQKEARl<TGSSTrj^^ 
LCLGFALCASVPILPLQYTjTFI LQVI SRS FLYGSNAAFLTIAFP 
S EH FGKLFGLVMAI»S AWSLLQ F P I FTL I KG S LQNDP FYVNVMF 
MIAIIoLTFFHPFLWRBCRTWXESPSAIA 


5434 


66 


652


RyAALXISLIQHKLLWRNQHCSRCVIMSPAQSAGLN^LFyGSGK 
HGP? 1X3CS Q YPACDY VRP LKS S ADGH I VKVLEGQVCPACG ANLV 
LRQ GRFGMFIGCINYPECEHTBLIDKPDETA ITCPQCRTGHLVQ 
RRS RYG KTFHS CDRYPE CQFA1 NFKP I AGEC PECHYPLL I EKKT 
AQGVKHFCASKQCGKPVSAE 


5435


4704 




1 P G D S S QRLAfcM S N AKER KHAK KMRN QPTNVTLS S G FVADR G VKH 
! HSGGEKPFQAQKQEPHPGTSRQRQTRVNPHSLPDPEVNBQSSSK 
GMF RfCKGG WKAGPKGTSQE I PKYI TASTFAQARAAEI S AMLKAV 
TQKSSNSLVFX)TLPRiiMRRRAKSHNVXRI,PRRLQEIAQKEAEKA 
VHQKKBHS KNKCHKARRCHMNRTLE FNRRQKKNI WLETH IW1IAK 
R PHMVEOCWG YCLGERPTVKSHRAC YRAMTNRCIiLQDtiS YYCCLE 
LKGKEEEILKALSGMCNIDTGLTFAAVHCLSGKRQGSLVLYRVN 
KYPRBMLGPVTFIWKSQRTPGDPSESRQLWIWLHPTLKQDILEE 
IKAACQCVEPIKSAVCIADPLPTPSQEKSQTBI,PDEKIGKKRKR 
KDDGBNAKPI KKI IGDGTR3PCLP YS WI S PTTGI 1 1 SDLTMEMN 
RFRLIGPLSHSILrEAIKAASVHTVGEDTEETPHRWWIETCKKP 
DSVSLHCRQEAIFELLGG I TSPAE I PAGTXLGLTVGDPRINL PQ 
KKS KALPNP EKCQDNEKVRQLLLEGVP VECTHSF I WNQD I C KS V 
TENKISDQDLNRMRSELLVPGSQLILGPHESK1PILLIQQPGKV 
TGBDRLGWGSGWDVLLPKGWGMAFW2 PFIYRGVRVGGLKESAVH 
SQYKRSPNVPGDFPDCPAGMLFAEEQAKNLLEKYKRRPPAKRPN 
YVKLGTLAPFCCPWEQLTQDWESRVOAYEEPSVASSPNGKBSDL 
RRS EVPCAPMPKKTHQPSDEVGTS I BHPREAEBVMIIAGCQESAG 
PER ITDQEAS ENHVAATGSHLCVLRSRKLLKQIjS AWCGPSS EDS 
RGGRRAPGRGQQGLTREACLSIIiGKFPRALVWVSLSLLSKGSPE 
PHTMICVPAKEDFLQLKEDWHYCGPQESKHSDPFRSKIOKQKEK 
KKREKRQKP\GRASSDGPAGEEPVAGQEALTLGLW3GPLPRVTL 
HCSRTLLGFVTQGDFSMAVGCGEALGFVSLTOLLDMLSSQPAAQ 

RfST.VT T.T7DD71GT ftVDDRDTH TTSf» 

v ijijKlr it AoliWX Kc AKiAI EV 


5436 


1781 


635 


ASDS I PWSEARTTRKLAQRGCQWSLPERMPLWFCGLP YSGKSR" 
RAEELR VAIiAAEGRA VYVVTJDAAVLGAED PAVYGDSAREKAIiRG 
AIJUlSVERRLSRHDVVILDSLNYIKGFRYELY\CLARAARTPLC 
LVY C VRPGGP IAG PQVAG ANENPGRNVSVSWRPRAEEDGRAQAA 
GSSVLRELHTADS VVNGSAQADVPKEIiEREESGAABS PALVTPD 
SEKSAKHGSGAFYS P ELLEALTEiRFEAPDSRNRWDR P LFTLVGli 
EEPIiPLAGIRSALFEWRAPPPHQSTOSQPLASGSFLHQLDQVTS 

QFI S YTKMHPNNBNIiPQLANMFLQYLSQSLH 


5437 


739


1672 


CQEAASEFGGPLH?PAMFLRRLGGV7LPRPWGRRKPMRPDPPYPE 
PRRVDSSSENSGSJDWDSAPETMEDVGHPKTKDSGALRVSRAASE 
PSKEEPQVEQLGSKRMDSLKWDQPISSTQESGRLEAGGASPKLR 
WOHVDSGGTRRPGVSPEGGL\GVPGPGAPLEKPGRREKLLGWLR 
GBPGAPSRYlJSGPEECLQISTNLTLHIiLELLASALLALCSRPLR 
AALDTLGLRGP LGLWLHGLLS FLAALHGLHAVLSLLTAHPLHFA 

CLroLLQALVLAVSLREPNGDEAATDWBSEGLERBGBEQRGDPG 
KGL 


5438


2443


1152 


TKPRKRRHQPASQRQRPWSSDSTGDUiARGKGRKEENKGSDRVS 
LAPPSLRRPMMCQSEARQGPELRAAKWLHFPQLALRRRLGQIjSC 
MSRPALKLRSWPLTVLYYIiLPFGALRPLSRVGWRPVSRVALYKS 
VPTRLLSRAWGRIJfQVELPHWLRRPVYSLYIWTFGVNPIKEAAVE 
DI^HYRNLSBFFRRKLJCPOARPVCGUISVISPSDGRILKFGQVK 
UCEVEQVKGVTYS IiES FLGPRMCTEDLPFP PAASCDS FKNQLVT 








REGNELYHCVIYLAPGDYHCFHSPTDWTV5HRRHFPQSLMSVNP 
GMARWIKEL FCHNER VVLTGDWKHG FPS LTAVGAT\ NWGS IRI Y 
FDRDLHTKS PRHSKGSYNDFSFVTHTNREGVPMALR33HIiG/QS 
FNLGSTIVLI FEAPKDFNFQLKTGQKIRFGEALGSli 


5439 


2443 


1152 


*-^ K ^ K '»U*'Ai>UKUKi'WS5t>S i GDLLARGKGRKEENKGSDRVS 
IJU»PSLRRPMMCQSBARQGPELRAAKWr,HFPQLALRRRLGQl»SC 
MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS 
VPTRLLSRAWGRLNQ VEL PHWLRRP VYS LYI WTFG VNMKBAAVE 
JJL.HH XKWi.bh FFRRKLKPQARPVCGLHS VISPSDGRI LNFGQVK 
NCEVEQVKG VT YS LES FLG PRMCTEDLP FPPAAS CDS FKNQLVT 
RBGNELYHCVIYLAPGDYHCFHSPTDWTVSHRRHFPGSLMSVNP 
GMARW IKELFCHN3RWLTGDWKHGFFSIjTAVGAT\NMGS IRtY 
FDR DLHTNS P RHS KGS YNDFS FVTHTNREG VPMALRGEHLG /QS 
FNLGSTI VL I FEAPKDFNFQLKTGQK IRFGEALGSL 


5440 


£93 


253 


BPI FVTPDHRLVTHTHI V \QTFS PVNS \GQPPNYEMLKEEQEVA ' 
MLGAPHNPAPPMSTVIHIRSETSVPDHVWSLFNTIiFMNTCCLG 
FIAFAYS VKSRDRXMVGDVTGAQAYASTAKCLNI WALILG I FMT 
ILLIIIPVLWQAQR 


5441


2 


2054 


CRDGGKNGFMVS PMKPLE IXTQCSG PRMDPKICPADPAFFSFIN 
MSDLWVANIETGE ERRLTFCHQGLSNVLDDPKSAGVAT FVIQE B 
FDRFTGYWWCPTASWEGSEGIiKTLRILYEEVDESEVEVIHVPSP 
ALE ERKTDS YR Y P RT3S KNP K lALKIiAE FQTDS QGK1VS TQE KE 
LVQPFS SLFPKVE YI ARAGWTRDGKYAV/AM FLDRPQQWLQLVLL 
PPALFIPSTENEEQ\RLASARAVPRNVQPYVVYEEVTMVWINVH 
DIFYPFPQSEGEDEIiCFLRANECKTGFCHLYKVTAVLKSQGYDW 
SEPP3 PGEGEQSLTNAIWVNEETKLVYFQGTKDTPLEHHLYWS 
YEAAGEXVRLTTPGFSHSCSMSQNFDMFVSHYSSVSTPPCVHVY 
KLSGPDDDPLHKQPRFWASMMEAAKIFHPHTRSDVRLYGMIYKP 
HAWPGKKHPTVLFVYOTPQVQLVNNSFKGIKYLRLNTLASLGY 
AVWI DGRGSCQRGLR F3SGALKNQMGQ VEIEDQ VEGItQFVAE KY 
GFIDLSRVA1HGWSYGGFLSLMGLIHKPQVFKVAIAGAPVTVWM 
AYDTGYTERYMDVPENNQHGYEAGSVALHVEKLPNEPNRLLILH 
GFLDBNVHFFHTNFLVSOL IRAGKP YQLQVALPPVS PQI YPNBR 
HSIRCPESGEHYEVTLLHFLQEYL 


5442


X 


3474


CGQRSRRRS PDMPBAKPAAXKAP KGKDAP KGAPKEAP PKE APAE " 
APKEAPPEDQSPTAEEPTGVFLKKPDSV3VETGKDAVWAKVNG 
KELPDKPTIKWFKGKWLELGSKSGARFSFKESHNSASNVYTVEL 
HIGKVVLGDRGYYRLEVECAKDTCDSCGFNIDVEAPRQDASGQSL 
ESFKRTSEKKSDTAGELDFSGLLKKREVVEEEKXKKKKDDDDLG 
I PPE IWBLLKGAKKSEYEKIAFQYG I TDLRGMLKRLKKAKVEVK 
JCSAAFrKKLDPAYQVDRGNfKlKLMVEISDPDLTLFCWPKNGQEIK 
PSSKYVFENVGKKRILTINKCTLADDAAYEVAVKDEKCFTELFV 
KEPP VLI VTPLEDQQVFVGDRVEMAVEVS BE GAQVMWMKDG VEL 
TREDSFKARYRFKKDGKRHILI FSDWQEDRGR YQVITNGGQCE 
AEL I VEEKQLE VLQDI ADLTVKAS EQAVFKCEVS DEKVTG KWYK 
ITGVEVRPSKRITISHVGRFHKIiVIDDVRPEDEGDYTFVPDGYAL 
GSLSAKLNFIjEI KVEYVPKQ\EPP KI PLGFASGGKTSENAD/IV 
WAGNKLRLDV\SITGEAPSPFAT\NLKG\DEVFTTTEGRTRIE 
KR VDCSS FVIESAQRED EG RYTIKVTNP I GEO VASIFLQVVD VP 
DP PEAVR I TS VGEDW A I1*VWEPPMYDGGK P VTG YIiVERKKKGSQ 
RWMKLNFEVFTETTYESTKMIEGILYEMRVFAVNAIGVSQPSMN 

TKPFMPIAPTS3PLHLIVEDVTDTTTTLKWRPPNRIGAGGIDGY
LVEYCLEGSEENVPANTEPVERCGFTVKNLPTGARILFRWGVN
IAGRSEPATLAQPVTIREIAEPPKIRLPRHLRQTYTRICVGEQLN
LWPFQGKPRPQVVWTKGGAPLDTSRVHVRTSDFDTVFFVRQAA
RSDSGEYELSVQIBNMKDTATIRIRWEKAGPPINVMVKEVWGT
NALVEWQAPKDDGNSEIMGYFVQKADFCKTMEMFNVYERNRHTSC
TVSDLIVGNEYYFRVYTBNICGLSDSPGVSKNTARILKTGITFK
PFEVTKEHDFRMAPKFLTPLIDRWVAGYSAALNCAVRGHPKPKV

VWMKNXMEIREDPKFLI TNYQGVLTLWrRRPSPFDAGTYTCRAV 
NELG EAliAB CKLEVRVPQ 


5443 


66 


1003 


SRGQLDAGQSSEQHGGNRQPEQSRSRSSSSSSSPRRSRSAAEPA 
MALSMPLNGLKEEDKEPLIELFVKAGSDGESIGNCPFSQRLFMI 
LWLKGWFSVTTVDLKRKPADLQNIiAPGTHPPFITFNSBVKTDV 
NKIEEFLBEVLCPPKYLKLSPKHPESNTAGMD1FAKFSAYIKNS 
RPEANBALERGtjLKTLQKIiDBYIiWS PIjPDBI DENSMEDIKFSTR 
KFLXX5NEMTLADCNLLPKIiHIVKWAKKYRNFT>IPKEMTC 
LTNAY SRDEFTNTCPS DKEVE I \ AYS DVAKRLHQVKSRLLKE VS 
FMSSP 


5444 


2 


344 


5<5PldVT(^QMAKWIJU)YLSF(^RRPPPQPPTPDYTESblLRAY 

I KVEAAEMARAKALLGGPGBELEADTEYIiDP FDAQPHPAP PDDG 
YME P YDAQWVMS ELPGRGVQLYDTP YEEQDPETADGPPSGQKPR 
QSRMPQEDER PADE YDQP WBWKKDH ISRAFAVQFDS PE WERTPG 
SAKELRRPPPRSPQPAERVDPALPLEKQPWFHGPLNRADAESLL 
SLCKEGSYLVRLSETNPQDCSLSLRSSQGFLHLKFARTRENQW 
LGQttS GPFPS VPEL VLHYS SR PL P VQGAEHLALLYP WTQTP * Q 
* PDWGDRRPNGQ VATG LPE LWGAEAPSAAAHPGLHRERHPEGLP 
RAEKPGLRGPLLGLREPLGAGPRGPWGLQEPRRCQVWFSQAPAH 
QGGGCGYGQSQGPSGRPRGGAGSRH 


5445 


2364 


466 


ILSRGFLGSVEICIQLPLPASEPVliLIiTWARRRWRETRSRREPT 
TLRAQSVCPWWI * BTRMNRS IPVEVDESE P YPSQLLKPI PEYSP 
EBESEPPAPNIRNMAPNSLSAPTMLHNSSGDFSQAHSTLKLANH 
QRP VSRQVTCLRTQVLE DS EDS FCRRHPGLGKAFPSGCS AVSE P 
AS ES VVGALPAEKQFSFME KRNQWLVS QLSAAS PDTGHDSDKS D 
V^u ja> Uoyiil'lV y Ki*y V rl KJ» KAv> LA) Jj e 1 JJJTG X D S Q P Q 
DVLGIRQLERP1.PLTSVCYPQDLPRPLRSREFPQFEPQRYPACA 
QMI» P PNI*S PHAP WNYHYHCPGS P DHQ VP YGHD YPRAAYQQVIQP 
ALPGQPLPGAS VRGLH P VQKVI LNYPS PWDQEBRPAQRDCSFPG 
LPRHQDQPHHQP PNRAGAPGESLECPABLRPQVPQPP S PAAVP R 
PPSNPPARGTLKTSNLPEELRKVFITYSMDTAMEWKFVNFLLV 
NGFQTAIDIFEDRIRGIDriKWMBRYLRDKTVMIIVAISPKYKQ 

PNAKKEH VPTWLQMTHVYS WPKNKKN I LLRLLREE BYVAPPRGP 
LPTLQWPL 


5446 


972


161 


SS WS WCTGRMRKTRIjWGI»1»WMLFVSELRAATKLTEBKYELKEGQ 
TLDVKCDYTLEKFASSQKAWQIIRDGEMPKTLACTERPSKNSHP 
VQVGRIILEDYHDHGLLRVRMVNLQVEDSGLYQCVIYQPPKBPH 
MLFDRIRLVVTKGFSGTPGSNEWSTQNVYKIPPTTTKALCPLYT 
TPRTVTQAPPKS TAD VSTPDS EINLTNVTD I IRVP VFN I VI I»LA 
GGFLSKSLVFSVLFAVTLRSFVP+AHEPTRMSSDFQPHPSGSCA 
KGGGRR 


5447 


207 


617 


MXARTLSLMASLVAYDDSDSEAETEHAGSFNATGQQKDTSGVAR 
PPGODFASGTLDVPKAGAQPTKHGSCEDPGGYRLPLAQLGRSDR 
GSCPSQRLQWPGKE PQVTFP IKEPSCSS LWTSHVPASHMPLAAA 
RFKQ VXLSRNFPKS S FHAQS ESETVGKMGSS FQKKKCEDCWP Y 
TPRRL RQRQALSTETGKGKD VEPQGPPAGRAPAPL YVG PGVSEF 
I QP YLNSHYKETTVPRK VL PHIiRGHRGP VNTI QWCPVLSKSHML 
LSTSMDK^FKVWNAVDSGHCLQTYSLHTBAVRAARWAPCGRRIL 
SGGFDFALHLTDLETGTQLFSGRSDFRITTIiKFHPXDHKIFLCG 
GFSSEPHOVWDIRTGKVMRSYKAriQQTLDILFLREGSEFLSSTD 
ASTRDSADRTI I AWDFRTSAKI SNQI FHERFTCPSI*ALHPREPV 
FLAQTlTCbry LALFS TVWP YRWSRRRJR YEGHKVEG YS VGCECS PG 
GDLLVTG S ADGRVLMYS FRTAS RACTLQGHTQ ACVGTTYH PVLP 
SVLATCS WGGDMKI t7H*AFHWLSLGEA IGDLAPARG YSGPGRS Ii 
KSPSPSKSLLVLLCGRAMFQPATCPWQLPALSK 


5448 


194 


1633 


MAS KVTDAI VW YQXKIGAYDQQI WE KS VEQRE I KGLRNKPKKTA ' 
HVKPDI»I DVDIiVRGSAFAKAKPESPWTS LTTKG I VRWFFP F FF 
RWWLQVTSKVI FFWLXVLYLLQVAA I VL FCSTS SPH3 1 PLTEVI 
GPIWLMLLLGTVHCXJIVSTRTPKPPLSTGGKRRRKLRKAAHLEV 
HREGDGSSCTDNTQEC^VQiraGTSTSHSVGTVFRDLWHAAFFLS 
GSXKAKNS 1VKS TETDNGYVS LDGKKTVKSGEDGIQNHEPQCBT 








IRPEETAWNTGTLRNGPSKDTQRTITNVSDfeVS^BEGPBTGYSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
QDSBSARPESETEDVLWEDLIiHCABCHSSCTSBTDVENHQINPC 
VKKBYRDDPFHQSHLPWLHSSHPGLEKISAIWEGNDCKKADMS 
VLB ISGMI MNRVNSH I PGIG YQI FGNAVSLILGLTPF VFRLSQA 
TDLEQLTAHSASBLYVIAFGSNEDVI VLSMVI I SFWRVSLVWI 
FFPLLC VAERTYKQ VGI M * TS EG VLRNRKSHHYKKHYPNEDAP K 
SGTSCSSRCSSSRQDSESARPESETEDVLWEDLLHCAECHSSCrr 
SETDVENHQINPCVFCKEYRDDPFHQSHLPWLHSSHPGLEKISAI 
VWEGNDCKKADMSVLEISGMIMNRVNSHIPGIGYQIFGNAVSLI 
I/3LTPFVFRLS QATDLEQLTAHSASEL Y VI AFGSNEDVI VLS MV 
I IS FWRVSLVWI FFFLLCVAERTYKQVGIM


5449


194 


1833 


MASKVTDAIVWYQKKIGAYDQQIWEKSVEQREIKGLRNKPKKTA
KVKPDLIDVDLVRGSAFAKAXPESPWTSLTTKGIVRWFFPFFF
RWWLQVTSKVIFFWLLVLYI^QVAAIVLFCSTSSPHSIPLTEVI

GPIWLMI^LGTVHO}IVSTRTPKPPLSTGGiaU?3iKLRKAAHLEV 
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAFPLS 
GS KKAKNS I DKS TETDNGYVS LDGKKTVKSG EDGI QNHEPQCST 
I RPEBTAWNTG TLRNGPSKDTQRTI TNVSDEVSSE EGPETG YSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
QDSBS ARPESETEDVLWEDLIiHCAE CHS SCTS ETDVENHO INPC 
VKKEYRDDPFKQSHLPWLKSSHPGLEKrSAIVWEGMDCKKADMS 
VLEISGMIMNRVWSHIPGIGYQIFGNAVSLILGLTPFVFRLSQA 
TDLEQLTAHSASELYVIAFGSNEDVI VLSMVl ISFWRVS LVMI 
FFFLLCVAERTYKQVGIM*TSEGVLRNRK3HHYKKHYPNEDA?K 
SGTSCSSRCSSSRODSESARPESBTEDVLWEDLI.HCABCHSSCT 
SETDVENHQINPCVKKEYRDDPFHQSHLPWLHSSHPGliEKISAI 
VWEGNDCKKADMSVLEISGMtMNRVNSHI PGIG YQI FGNAVSLI 
LGLTPFVFRLSQATDLEQLTAHSASELYVIAFGSNEDVIVLSMV 
IISFWRVSLVWI FFFLLCVAERTYXQVGIM 


5450 


813


1242 


GQQFASFFG*NHPEVTVAMALTDIDLQLQFSMSQPEALLLLAAG 

PADHLLLQLYSGHLQVRLVLGQEBLRLQTPAETLLSDSIPHTW 

LTWEG WATLSVDG FLNAS SAVPGAPLB VPYGLFVGGTGTLGLP 

YLRGTSRPLRGCLHAATLNGRSLLRPLTPDVHEGCAEEFSASDD 

VALGFSGPHSLAAFPAWGTQDEGTLEFTLTTQSRQAPLAFQAGG 

RRGDFIYVDIFEGHLRAWEKGQGTVLLHNSVPVADGQPHEVSV 

HINAHRLEISVPQYPTHTSNRGVLSYLEPRGSLLLGGLDAEASR 

HLQEHRLGLTPEATNASLLGCMEDLSVNGQRRGLREALLTRNMA 

AGCRLEEEEYEDDAYGHYEAFSTLAPBAWPAMELPEPCVPEPGL 

PPVFANFTQIiLTISPLWAEGGTAWLEWRHVQPTLDLKEAELRX 

SQVLFSVTRGAHYGELELDIIiGAQARKMFTLLDVVNRKARFIHD 

GSE0TSDQLVLEVSVTARVPMPSCLRRGOTYLLPIQVNPVNDP? 

HirFPHGSLMVILEHTQKPLGPEVFQAYDPDSACEGLTFQVLGT 

SSGLPVERRDQPGEPATEFS CRELEAGSL VYVHCGG PAQDLTFR 

VSDGLQASPPATLKWAIRPAIQIHRSTGLRLAQGSAMPILPAN 

LSVETNAVGQDVSVLFRVTGALQFGELQKHSTGGVEGAEWWATO 

AFHQRDVEQGRVRYLSTDPQiniAYDTVENIjALEVQVGQEIIiSNL 

SFPVTI QRATVVWLRLEPLHTQNTQQETLTTAHLEATLEEAGPS 

PPTFH YE WQAPRKGNLQLQGTRLSDGQG FTQDD I QAGRVT YGA 

TARASEAVEDrFRFRVTAPPYFSPLYTFPIHIGGDPDAPVLTKV 

LLWPEGGEGVLSADHLFVKSLNSASYLYS T /MERPRLGRLAWRG 

TQDKTTMVTS FTNEDLLRGRLVYQHDDSETTEDD I PFVATRQGE 

S SGDMAWEEVRGVFRVAIQPVNDHAPVQT I SRI FHVARGGRRLL 

TTDDVAFS DADSG FADAQLVLTRKDLL FGS IVAVDEPTRPI YRF 

TQSDLRKRRVLFVHSGADRGWIQLOVSDGQHQATALLEVQASEP 

YLRVANGSSLWPQGGQGTIDTAV1*HLBTNLDI RSGDEVHYH VT 

AGPRWGQIiVRAGQPATAFSQQ DLLDGAVLYSHNGSLS PE DTMAF 

S VEAGP VHTDATLQ VTI ALEGPLAPLBCL VRHKKI YVFQGEAAE I 

RRDQLEAAQEAVPPADI VFSVKS PPSAGYLVM VSRGALADEP PS 

U>PVQS FSQEAVDTGRVLYLHSRPEAWSDAFSLDVASGLGAPLE 

GVLVELEVLPAAI PLEAQNFS VPEGGSLTLAP P LLRVSGP YFPT 








LLGLS LQVLEPPQHG PLQKEDGPQARTLS AFSWRiMVBEQLlRYV 
HDGSBTLTDSPVIWAHASBMDRQSHPVAFTVTVTjPVNDQPPILT 
TNTGLQMWBGATAp I PAEALKSTDGDSGSBDLVY TIEQPSNGRV 
VLRGAPGTEVRS FTQAQ LDGGLVL FSHRGTLDGG FPFRLSDGEH 
TSPGHFFRVTAQKQVLLS LKGSQTLTVCPGSVQP LSSQTLRASS 
S AGTD PQLIiLY RVVRGPQLGRLFKAQQDSTGE ALVNFTQ AEVYA 
GNI LYEHEMPPEPFWEAHDTLELQLSS PPARDVAATLAVAVSFE 
AACPCRPSHLWKNKGLWVPEGQRARITVAALDASNLLASVPSPQ 
RS EHD VLFQ VTQFPSRGQLL VSEB PLHAGQPHFLQSQLAAGQLV 
YAHGGGGTQQDGFHFRAHLQGPAGA3 VAG PQTSEAFA1TVRDVN 
ERPPQPQASVPLRLTRGSRAPISRAOLSWDPDSAPGBIEYEVQ 
RAPHNGFIjSLVGGGIjGP V7RFTQADVDSGRLAFVANGSSVAG1 F 

OTiKMSTirJfXQ PPT»PM*37*ZiVT"lTF.DQll T E*\/r\T DIVOT C1TT>niv T i~*t\ f r»-r 

SQQQLRWSDREEPEAAYRLIQGPQYGKLLVGGRPTSAFSQFQI
DQGEWFAFTNFSSSHDHFRVLAUU^GVNASAVVWTVRALLHV
WAGGPWPQGATLRLDPTVLDAGELANRTGSVPRFRLLEGPRHGR
WRVPRARTEPGGSQLVEQFTQQDLEDGRLGLEVGRPEGRAPGP
AGDSLTLELWAQGVPPAVASLDFATEPYNAARPYSVALLSVPEA
ARTEAGFCPESSTPTOEPGPMASSPEPAVAKGGFLSFLEANMFSV
IIPMCLVLLLLALIILPLLFYLRKRNKTGKHDVQVLTAKPRNGLA
GDTETFRKVEPGQAIPLTAVPGQGPPPGGQPDPELLQFCRTPNP
ALKNGOYWV


5451


i 


2274 



RDSS EQGRTGDTLGRPSACMD ALKP P CLWRNHERG KKDRD S CGR " 
KNSEPGS PHSLEALRDAAPSQGLNFLLLFTKMLFI FNFLPSPLP 
TPALI C IliTFGAAI FLWLI TR PQP VLPLIjDLKNQS VGIEGGAR K 
GVSQKNNDLTS CCFS DAKTM YEVFQRGIAVSDNG P CLGYRKFNQ 
P YRWLS YKQVS DRAB YLGS CLLHKG YKSSPDQFVG I FAQNRPSW 
I ISELACYTYSMVAVPLYDT^GPEAI VH XVNKADIAMVI CDTPQ 
KALVLIGNVEKGFTPSLKV7 rLMDPFDDDLKQRGEXSGIEXLSL 
YDAENLGKEHFRKPVPPS PEDLS VI CFTSGTTGDPXGAM ITHQN 
x visiTtui/ir u^vVAiuviJSiJri VUUVi%l& xijPliAHMiJBRIVQAVVYS 
CGARVG FFQGD IRLLADDMKTLKPTLFP AVPRLLNR I YDKVQNE 
AKTPLKKFLLKLAVSSKFKELQKGIIRHDSFWDKLIFAKIQDSL 
GGRVRVIVTGAAPMSTSVMTFFRAAMGCQVYEAYGQTECTGGCT 
FTL PGDWTSGHVG V P LACNYVKLEDVADMNTFTVNNEGEVCT KG 

TrWFKGYLKDPEKTQEALDSDGWLHTGDIGRMLPNGTLKIIDRK 
KNIFKIAOGEYI APElfTRNT v NR QnPVT.OTTFi fur's* ct dcpt \rr>-\r 

VVPDTDVLPSFAAKLGVKGSFEBLCQNQVVREAILEDLQKIGKE 
SGLKTFEQVKA1FLHPEPFSIENGLLTPTLKAKRGELSKYFRTQ 
ID5LYEHIQD 


5452


1633 


1138 


SRVPSLCLSI^LSLSPSREPVAGAPGCGTAGPPAMATLWGGLLR 
LGSLI^LSO^SVLLI^QLSDAAKNFEDVRCKCICPPYICEWSG 
HIYNKNISQKDCDCLHWEPMPVRGPDVEAYCLRCECKYEERSS 
VTIXVTI IIYLSILGLLLLYMVYLTLVEPILKRRLFGHAQLI QS 
DDDIGDHQPFANAHDVLARS RS RANVDNKVE YAQQRW KLQVQEQ 
RKSVFDRHWLS 


5453 


111 


1520 


PSXPAAVPQSAPPEPHREETVTATATSQVAQQPPAAAAPGBQAV " 
AGPAPSTVPSSTSKDR PVSOPSLVGSKEPpppjxp qr ^zesnr q n w 
PQEERSQQQDD IEELETKAVGMSNDGRFLKPDI E IGRGSFKTVY 
KGLDTETTVBVAWCELQDRKLTKSERQRFKEEAEMLKGLQHPKI 
VRFYDSMESTVKGKKCIVLVTEl^TSGTLKTYLKRFKVMKIKVL 
RSf/CRQILICGLQFLHrRTPPIIHRDLKCDNrFrTGPTGSVKIGD 
LGLATLKRA5 FAKS VIGTPEFMAPEMYEE KYDBS VCVYAH3MCM 
LEMATSEYP YSECQNAAQIYRRVTSGVKPASFDKVAIPEVKEI I 
EGCIRQNKDERYS I KDLLNHAFFQEETG VR VELAEEDDGEKIAI 
KLWLRIEDIKKLKGKYKDNEAI EFSFDLERNVPEDVAQEMVBSG 
YVCEGDHKTMAKAIKDRVSLIKRKREQRQL* 


5454 


111 


1520 


PS I PAAVPQSAPPE PHREETVTAXATSQ VAQQPPAAAAPGEQAV 
AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKE 
PQBER£QQQDDIEBlxETKAVGMSNlX5RFLKFDrEIGRGSFKTVY 
KGLDTETT VEVAWCEliQDRKLTKSERQR FXEBAEMLKGLQHPNI 








VRFYDSWESTVKGKKCIVLVTELMTSQTLKTVLKRPKVMKIKVL

RS WCRQILKGLQFLHTRTP PI 1 HRDLKCDNI P ITG PTGS VKIGD 
L0LATLKRASFAKSV1GTPEFMAPEMYEEKYD2SVDVYAFQMCM 
LEMATSEY PYS E CQNAAQ I YRRVTSGVK PAS FDKVAt PEVEQ5 1 1 
EG C IRQNKDER YS XKDLLNHAFFQEETGVR VELAEBDDGE K1A I 
KLWLRIEDI KKIiKGKYKDNEAIE PSFDLERNVPEDVAQEMVKSG 
YVCEGDWCTMAKAIKDR VSltlKRKREQRQL * 


5455 


1359


377 


LTMVSPATRKSLPKVKAMDFITSTAILPLLFGCLGVFGLFRLLQ 
WVRGKAYLRNAVWITGATSGIiGKECAKVFYAAGAKLVLCGRNG 
GALEELIRELTASHATKVQTHKPYLVTFDLTDSGAIVAAAAEIL 
- QC FGYVD I LVNNAG I S Y RGT IMDTTVD VDKRVME INYFG PVALT 
KAIjIiPSMIXRR QGHI VAISSIQCKMS IPFRSA YAAS KHATQAFF 
DCLRAEMEQ YE I E VT V I S PG YIHTNLSVNAI TADGSRYG VMDTT 
TAOXSRSPVEVAQDVIiAAVGKKKKDVIIjADLLPSLAVYLRTIjAPG 
LFFSLMASRARKERKSKNS 


5456 


2 


2332 


CGAGL VAAGAVL VL Y PAS RAGERTRV ?3S P APSSLPLHS PGACG 
TEVDMDPQRSPliLEVKGNIELKRPLIXAPSQLPLSGSRLKRRPD 
QMEDGLEPBKKRTRGLGATTKITTSHPRVPSLTTVPQTQGQTTA 
OKVSKKTGPR CS TA IATGLXNQKPVPAVP VQKS G TSGVP PMASG 
KKPSKRPAWDUCGQLCTLWAELKRCRERTQTLDQENQQI»QDQIiR 
I^QQQVTCAI^TERTTI^GKIiAKVQAQAEQGQQBLKNLRACVLEL 
EBRLSTQEGLVQBLQKKQVELQEBRRGLMSQLEEKERRLQTSEA 
AI»S SSQAEVAS LRQETVAQAAIXTEREBRLHGLEMERRRLHNQL 
yt.jjtviaw ik vr t.KVKPVL»PGEPTPPPGLLLFPSGPGGPSDPPTRL 
SLSRSDERRGTLSGAPAPPTRHDFSFDRVFPPGSGQDEVFEEIA 
MLVQSALDGYPVCIFAYGQTGSGKTFTMEGGPGGDPQLEGLI PR 
ALRHLFS VAQ ELSGQG WT YS FVAS YVE I YNETVRDLLATGTRKG 
QGGECE 1 RRAGPGSEELTVTNARYVPVSCEKEVDALLHLARQNJR 
AVARTAQNERS SRSHS VFQLQ I SGEHS SRGLQCGAPLS LVDLAG 
S BR LD P GLALG PGERBRLRETQAINSSLSTLGL VI MALSNKE SH 
VPYRNS KLTYLLQNSLGGSAKMLMFVN I SPLEENV3 ESLNS LRF 
ASKVEPSVLFGTAQSNRKVTKTDPDIjCVCVCVCVCVCVCVCVCVP 
MSMYRVRGGRVAGGCFIGWRAPCPRAZX 


5457 


2 


1540 


DDFVERRRWTRTTCLVRS PPHVPVCGHACSWNGGS LDPI>KGT PA 
LLRSAERLMRKVKKLRIiDKENTGSWRSFSLNSEGAERMATTGTP 
TADRGDAAATDDPAARFOVORTJCLITV'T PCTTurcDwepT Txrvttr 

APHDFQF VQKTDE SG PHS HRLY YLGMP YGSRENSLLYSE I PKKV 
RKEALLIXSWKC^riDHFOATPHHGVYSREEELLRERICRXGVFGI 
TSYDFHSESGLFLFQASNSLFHCRDGGKNGFMVSPGPGCVSPMK 
PLE I KTQCSGPRMDPKICPADPAFFS FINNSDL WVANTETGEER 
RLTFCJKQGLSNVLDDPKSAGVATFVIQEBFDRFTGYWWCPTASW 
EGSEGLKTLRILYEEVDES E VEVIHVPS PALEERKTD3YRYPRT 
GSKKPXIALKLAE FQTDSQGKI VSTQEKELVQP FSS LFPKVEYI 
ARAG WTRDGKYAWAMFLDR PQQ WLQLVLLP PAL FI PSTENEEQA 
ASLCQSCPQECPAVOGVRGGHQRLDQCS 


5458 


?642 


4022


FVPGLRBPQWE PAQPSATMSAPSEEEE YARLVMEAQP EWLRAEV ' 
KRLSHELAETTREKIQAAEYGLAVl^BKHQLKLQFEELEVDYEA 
IRSEMEQLKEAFGQ AHTNHKKVAADG ES REES L IQES ASKEQ Y Y 
VRKVLELQTELKQLRNVLTN7QS ENERLAS VAQELKE INQNVE I 
QRGRL RD D IKE YKFRE ARLLQD YSELBEENISLQKQ VSVLRQKQ 
VEFEGLKHEIKRJjEEETEYLNSQLEDAIRLKEISBRQIiEEALET 
LKTEREQKNSLRKEIiSHYMS indsfytshlhvsldglkfsddaa 
epnitoaealvngfehgglaklpldnktstpkkbglappspslvs 

DLUSELNlSEIQKiKQQIiviQMEREKAGLLATLQDTQKQLEHTRG 

SLSEQQEKVTRLTENLSALRRLQASKERQTALDNBKDRDSHEDG
DYYEVDINGPEIIACKYHVAVAEAGBLREQLKALRSTHEAREAQ

HAEEKGRYEAEGQALTEJWSLLE^ 

dvagetqgslsvaqdelvtfseeianlyhhvc^cnnetpnrwil 

D Y YREGQGGAGRTS PGGRTS PEARGRRSP I LLPKGLLAPEAGRA 

dggtgdsspspgsslpsplsdprrbpkniynuaiirdqikhlq 

AAVDRTTELSRQRIASQEIX5PAVDKDKEALMEEILKLK3LLSTK 








REQITTLRTVZjKANKQTAEVAIxW^LKSKYHNEKAMVTBTMMK^ 
NELKAIJCEDAATFSSIJiAMPATRCDEYITQLDEMQRQLAAAEDE 
KKTLNSLLRMAI QQ K1ALTQRLELLELDH EQTRRGRAKAAPKTK 

PATPSVSHTCACASDRAEGTGLANOVFCSEKHSIYCD


5459


316 


1262 


RGGHRLSGMASNFNDIVKQGYVRIRSRRLGIYQRCWLVFKKASS 
KGPKRLEKFSDERAAYFRCTKKVTELNNVJaOVARLPKSTKKHAl 
G1YFNDDTSKTFACESDLEADEWCKVLQMECVGTRINDISLGRP 
DLLATGVEREQSERFNVYLMPSPNLGCYMGECALQITYEYICLW 
D VQN PR VKL IS WPLSALRR YGRDTTWFTFEAGRMCE TGEGLFI F 
QTRDGEA1YQKVHSAALAIAEQHERLLQSVKN5MLQMKMSERAA 
S LS TM VPLPRS AYWQHI TRQHS TGQLYRLQD VS SPLKLHRTETF 
PAYRSEH 


5460 


45 


2037 


RPGCRAGELSTGSRARERVRNRVSAPCGQDSRRCDPEVLRGRSP 
GLGLAEMPSCG ACTCGAAAVRL ITSS LASAQRG I SG GR I HMS VL 
GRLGTFETQ1LQRAPLRSFTETPAYFASKDGISKDGSGDGNKKS 
ASEGS5KKSGSGNSGKGGNQLRCPKCGDLCTHVETFVSSTRFVK 
CEKCHHFFVVLSEADSKKSIIKEPESAAEAVKLAFQQKPPPPPK 
KIYNYLDKYWGOSFAKXVLSVAVYNHYKRIYNNIPANIiRQQAB 
VEKQTSLTPRELEIRRREDEYHFTKLLQIAGISPHGNALGASMQ 
QQVWQQIPQEKRGGBVLDSSHDDIKLEKSWILLLGPTGSGKTLI. 
AQTLAKCLD VP FAI CDCTTLTQAG YVGEDI ESVI AKLLQDANYN 
VEKAQQGI VFLDE VD K IGS VPG I HQLRDVGGEG VQQGLLKLLEG 
TIVIJVPEIG^SRKLRGETVQVDTTNILFVASGAFNGLDRI ISRRK 
NEKYLGFGTPSNLGKGIlRiUVAAADLANRSGESNTHQDIEElCDRL 
LRHVEARDU EFGMI PEFVGRLP WVPLHSLDEKTLVQ ILTBPR 
NAVIPQYQALFSMDKCELNVTEDALKAIARIALERKTGARGLRS 
IMKKLLLSPMFEVPNSDI VCVEVDKEWEGKKEPGYIRAPTKES 
SEEEYDSGVEEEGWPRQADAANS 


5461


1481 


160 


INPPPPPKSPCGRARKWRRRRRPGAPEAAVt^^SGPGPERLFD"" 
SHRLPGDCFLLLVLLLYAPVGPCLLVLRLFLG IHVFLVSCALPD 
SVLRRFVVRTMCAVLGLVARQEDSGLRDHSVRVLISNHVTP PDH 
NXVNLLTTCSTPLLNSPPSFVCWSRGFMEMNGRGELVESLKRFC 
ASTRLPPTPLLLFPEEEATNGREGLLRFSSWPFSIQDWQPLTL 
QVQRPLVS VTVSDASWVS ELLWSLF VP FTVYQVRWLRPVHRQLG 
BANE B FALRVQQLVAKEJjGQTGTRIiTPADKAEHMKRQRH PRLRP 

QSAQSSFPPSPQPSPDVQLATZAQRVKEVLPHVPIRFSVIQRDLAK
TGCVDLTITOLLEGAVAFMPEDITKGTQSLPTASASKFPSSGPV
TPQPTALTFAKSSWARQKSLQERKQALYEYARRRFTERRAQEAO


5462 




3353 


KIKERQMSANNS PPSAQKSVLPTAI PAVLPAASPCSSPKTGbSA " 
RLSNGSFS APS LTMS RGS VHTVSFLLQI GLTRE S VTI EAQELSti 
S AVKDLVCS IVYQKF PECGFFGM YDKIIiLFRHDMNSENl LQL IT 
SADB IHEGDLVEWLSALATVEDFQ IRPHTLYVHS YKAPTFCD Y 
CGEyj^mhVRQGhKCEGCGL^rHKRCAFKIPNNCSGVRKRRLSU 
VSLPGPGt»SVPRPLQPEYVALPSEESaVHQEPSKRIPSWSGRPI 
WMBKMVMOT VKVPHTFAVHS YTRP'r I CQ YCKRLLKGLFRQGMQC 
KDCKFNCHKRCAS KVPRDCLGE VTFNGE P SSLGTDTD I PMD I DH 
NDINSDSSRGLDDTEEPSPPEDKMFFLDPSDIiDVERDEEAVKTI 
SPSTSNNI PI^RWOSIKHTKRKSSTM^GWMVHYTSRDNLRK 
RHYWRLDSKCLTLFQNESGSKYYKEIPLSEILR1SSPRDFTNIS 
QGSWPKCFEI ITDTM V YFVGENNGDS SHNPVLAATG VGLD VAQS 
WEKAI RQALMPVTPQASVCTS PGQGKDH KDLSTS I S VSNCQ I QE 
NVDI S TVYQ I F ADEVLGSGQ FG I VYGGKHRKTGRDVA IKVIDKM 
R FPTKQESQLRNEVAI LQNLHHPGI VNLECMFET PERVFWMEK 
LHGDMLEMILS SEKS RLPER I TKFMVTQ I LVALRNLH FKN IVHC 
DLKPEKVLIiASAEPFPQVKLCDFGFARI IGEKSFRRSWGTPAY 
LAPEVLRSKGYNRSLDMWSVGVIIYVSLSGTFPFNEDEDINDQI 
QNAAFMYPPNPMRJBISGEA3DI*IWNLLQVKMRKRYSVDKSX*SHP 
WLQDYQTWLDLREPETRIGBRYITHESDDARWEIHAYTHNLVYP 
iGiFIMAPNPDDMEBDP 


5463 


237 


1012 


LI^VTMTTSRCSHLPEVI»PDCTSSAAPVVKTVEDCGSLVNGQPQ 
YVMQVSAKDGQLLSTWRTLATQS PFNDRPMCRI CHBGS5QEDL 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LS PCECTGTLGT I HRSCLBHWLSSSNTS YCELCHPRFAVERKPR 
PLVEWLRNPGPQHEJCRTLFGDMVCPLFJTPLATISGWIjCLRGAV 
DHLHFSSRLEAVGLIALTVALFTIYLFWTLVSFRYHCRLYNBWR 
RTNQRVI 1*1*1 PKSVNVPSNQPSLLGLHSVKRNSKETW 


5464 


195 


677 


S PS MNPRKK VDLKJj 1 I VGAIG VGKTSLtiHQ YVHKTFYE^ E YQTTIj " 
GASILSKIIILGDTTLKLQIWDTGGQERVRSMVSTFYKGSDGCI 
LAFDVTDLESFEALDIWRGDVLAKIVPMEQSYPMVLLGNKIDIA 
DRKYQSILENHLTESIKDSPDQSRSRCC 


5465 


5278 


3348


kgdprefirVhrealecdWsa^lhew idli pgykqqgpaavra 

VNVFHHLFYEGQVDIYNINDPLKETATIGFINNFGQIPKQI.FKK 
PHPPKRVRSRLNGDNAGISVLPGSTSDKIFFHHLDKLRPSBTPV 
KEliKEPVGQI VCTDKGILAVEQNKVt*I PPTWNKTFAWGYADLSC 
RLGTYESDKAMTVYECLSEWGQILCAI CPNPKLVITGGTSTWC 
VWEMGTSKE KAKTVTLKQALLGHTDTVTCATAS LAYHI X VSGSR 
DRTCIIWDLNKLSFLTQLRGHRAPVSALCINSLTGDIVSCAGTY 
IHVWSINGrfPrVSVNTFTGRSQQXlCCCMSEt4NEWDTQMVIVTG 
HSDGWRFWRMEFLQVPETPAPEPAEVLEMQBDCPEAQIGQEAQ 
DEDS SDSEADEQS I S QDPKDTPSQPSSTSHR PRAAS CRATAAWC 
TDSGSDDSRRWSDQLSLDEKDGFIFVNYSEGQTRAHLQGPLSHP 
HPNP I EVRNYSRliKPG YRWERQIiVFRSKIiTMHTAFDRKDNAHPA 
EVTALG IS KDHS RILVGDS RGRVFS WSVS DQPGRS AADHWVKDE 
GGDSCSGCS VRFSLTERRHHCRNCGQLFCQKCS3FQSE I KRLKI 
SSPVRVCQNCYYNLQHERGSEDGPRNC 


5466 


3 


992 


HACAHASAHASGRLVRWWRKRRSVMGIQTSPVLIASIK3VGLVTL 
I/3LAVGSYLVRRSRRPQVTLLDPNEKYLLRLLDKTTVSHNTKRF 
RFALPTAHHTLGLPVGKHIYLSTRIDGSLVIRPYTPVTSDEDQG 
YVDLVI KVYLKG VHPKFPEGGKMSQYLDSLKVGDVVE FRGPSG1* 
LTYTGKGHFNIQPNKKS P PEPRVAKKLGM IAGGTGITPMLQLIR 
AILKVPEDPTQCFLLFANQTEKDIILREDLEELQARYPNRFKLW 
FTLDHPPKDWA YSKGKVTADM I REHL? APGDDVL VLLCG P PPMV 
QLACH PNLDKLG YSQKMR FTY 


5467


2103 


4 


GEAI*RVGTRGCRRDLPDPQARIFIQKKDLEEDESVTAAHLKSRa 
RSPRKIDQFCNSSNMVHGSVTFRDVAIDFSQEEWECLQPDQRTL 
YRDVMLENYSHLISLAGSSISKPDVT TLLEQBKEPWMWRKETS 
RRYPDLELKYGPEKVSPENDTSEVNLPKQVIKQISTTIiGIEAFY 
FRWDSE YRQFEGLQG YQEGNI NQKM I S YEKLPTHT PHASL ICNT 
HKPYECKECGKYFSCGSNLIQHQSIHTGEKPYKCX2CGKAFQLH 
IQLTRHQKFHTGEKTFECKKCGKAFKLPTQLNRHXNIHTVKKLF 
ECKBCGKS FNRSSNLTQHQS IHAGVKPYQCKECGKAFNRGSNLI 
QHQKIHSNEKPF VCKEOGMAFRYHYQL IEHCQIHTGEKP FECKE 
CXSKAFTLLTKLVRHQKIHTGEKPPECRECGKAFSLIiNQLNRHKN 
IHTGEKPFECKECGKSFNRSSE5LVQHQSIHAGIKPYECKECGKG 
FNRGAHLI QHQKIHSNEKPF VCR BCEMAFRYHCQLI EHSRIHTG 
DKPFECQDCGKAFNRGSSLVQHQSIHTGEKPYECKECGKAFRLY 
LQLSQHQKTHTGEKPFECKECGKFFRRGSNLNQHRS IHTGKKPF 
ECKECGKAFIUjHMHLIRHQKLHTGEKPFECKECGKAFRLHMQLI 
RHQKLHTGEKP FECKECG KVFSLPTQLNRHKN IHTGEKAS 


5468


225 


2976 


SFLTDLFQSLAQLENLCKQLYETTDTTTRLQAEKALVEFTNSPD ' 
CLSKCQLLLERGSS S YSQLLAATCLTKLVSRTNNPLPLEQRIDI ' 
RNYVLNYLATRPXLATFVTQALIGLYAR XTKLGWFDCQKDDYVF 
RNAI TDVTRFLQDS VE YC I IGVTILS QLTNBINQ VSATAFL I EA 
DTTHPLTKHRKIASSFRDSSLFDIFTLSCNLLKQASGKNLNM) 
ESOHGLLMQLLKLTHNCLNFDPIGTS TDESSDDLCTVQI PTS WR 
SAFLDSSTLQLSTI GRCEY EKTCALIjVQLFDQSAQS YQBLLQS A 
SASPMDIAVQEX3RLTWLVYIIGAVIGGRVSFASTDEQDAMDGBL 
VCR VL QLMNL TDS RLAQAGNE KLELAMLS FFEQFR KIYIGVQVQ 
KSSKLYRRLSEVLGLNDETMVLSVFIGKI ITNLKYWGRCEPITS 
KTLQLLNDLS IG YS S VRKLVKi&AVQPMLNNHTSEHFS FLGINN 
QS^TDNRCRTTFYTALGRLLMVDLGEDEDQYEQFMLPLTAAFE 
AVAQMFSTNSFNBQEAKRTLVGLVRDLRGIAFAFKAKTSFMMLF 
EWIYPSYMPII^RAIELWYHDPACTTPVLKLMAELVHNRSQRLQ 
FDVS SPNG I LLFRETS KM I TM YGNRI 1»T LGE VP KDQ VYALKLKG 
IS I CFSMLKAALSGS YVNFGVFRLYGDDALDNALQTPI KLLLS I 
PHSDLLDV PKLSQS YYSLLEVLTQDHMNFIASIjEPHVIMYI LSS 
ISEGbTALDTMVCTGCCSCLDHIVTYIiFKQLSRSTKKRTTPLNQ 
BSDRFLH IMQQHPEMI QQMLS TVLCTI 1 1 FEDCRNQW SMS R PLLG 
LILLNEKYFSDLRNSIVNSQPPEKQQAMHLCFENLMEGIERNLL 
TKNRDRFTQNLSAFRREVNDSMKNSTYGVNSNDMMS 


5469 


134 


2653


uuefkTsLVpwhlpmgwlcsgllfpvsclvllqvassgnmkvlq 
b ptc vsdyms 1st cewkmngptncs telrllyqlvflls bahtc 
vpenkggagcvchllmddwsadnytldlwagqqllwkgs fkps 
ehvkprapgnltvhtnvsdtllltffsnpyppdnylynhltyavn 

IWSENDPADFRIYNVTYLEPSLRIAASTLKSGISYRARVRAWAQ 
CYJfTTNSEWS PSTKWHNS YREPFEQHLLLGVSVS CIVILAVCLL 
CY VS I TKIKKE W WDQ I PBfPARSRLVAII I QDAQG SQWE KRS RGQ 
EPAKCPHVJKNCLTKIXPCFLEHNMKRDEPPHICAAKEMPFQGSGK 
SAVfCPVEISKTVLWPESISWRCVELPEAPVECESEEBVEEEKG 
SFCASPESS R DDFQEGREG I VARLTES LFLDLLGEENGGFCQQD 

MGRSCTiT»PP9f!QXQ2VMMDtJ r n?'iri>oivr , oiri?ATinrji^-y^»/NT\T >n _ 
uuDotuurfouii aHrinirViiJ&tf roAo_ KcAPPWGKEQPLHLEPS 

PPASPTQSPDNLTCTETPLVIAGNPAYRSFSNSLSQSPCPRELG 
PDPLIJ^LEEVEPEMPCTPQLSEPTTVPQPEPETWEQILRRSFV 
LQHGAAAA P VS A P TS GYQ B F VHAVEQGGTQASAWGLG PPG BAG 
YKAFSSLIASSAVSPEKCGFGASSGEEGYKPFQDLIPGCPGDPA 
PV PVPLFTFGLDRE P PRS P Q SSHL PS S S PEKLGLE PGEKVEDMP 
KPPLPQBQATDPLVDS LCS G t VYSALTCHLCXIHiiKQCHGQEDGG 
QTPVMASPCCGCCCGDRASPPTTPLRAPDPSPGGVPLEASLCPA 
SLAPSG I SEKSKSSSSFHPAPGNAQSS SQTTPKIVNFVSVGPTYM 
RVS 


5470 


17 


1418 


tacrirtslnrglaavkkjjavemlasVgiaVslmkfftgpmsdf'"* 

iu*vvhjv r vwaiu<uKIlvAVJuvJnV VAGAI AAVFHTL IAYSDLGY YI 
INKLHHVDESVGSKTRRAFLY1JAFPFMDAKAWTHAGILLKHKY 
S FLVGCAS ISDVI AQWFVAI LLHSHLECREPLLI P ILSL YMG A 
LVRCTTLCLG YYKNIHDI I PDRSGPELGGDATI RXMLSFWWPtA 
L ILATQRI SRP IVNLFVSRDLGGSSAATEAVAl LTATYP VGHM P 
YGWLTErRAVYPAFDKNm>SNKLVSTSrnVTAAHIKKFTFVa4A 
LSIiTLCFVMFWTPNVSEKJLIDIIGVDFAPAELCWPLRIFSFF 
? Vp VTVRAHLTGW LMTLKKT F VLAP SS VLRI IVL IAS LWLPYL 
GVHGATLG VGS LLAGFVG E S TMDAIAAC YVYRKQ KKKM ENE S AT 
EGEDS AMTDMPPTE2 VTD I VEMREENE 


5471


1868 


558 


G PGV PG B VEMVKGQ P FDVG PRYTQLQ Y I G EGAYGMVS SAYDHVR 
KTRVAI KKI S PFEHQTYCQ RTLRE I Q I L LRFRHENV I G I RD I LR 
AS TLBAMRDVYI VQDLMETJDL YKLLKSQQLSNDHI C YFLYQ I LR 
GLKYIHSANVXHRDLKPSNU.INTTCDLKICDFGLARIADPEHD 
HTGFLTE YVATRW YRAPE IMLNS KGYTKSIDIWSVGCILAEMLS 
NR PIFPG KHYLDQLMHILG ILGS PSQBOLMCI INMKARNYLQSL 
PSKTKVAWAKLFPKSDSKALDIiLDRMLTFNPNKRITVEEALAHP 
YLEQYYDPTDEPVAEEPFTFAMELDDLPKERLKELIFQETARFQ 
PGVLEAP 


5472


1469 


753 


L YVMARYLSDEEVAVS IDRLCKANGRS PS IPFGTVRIPGRARVR 
DPQALWIFGYGSLVWRPDFAYSDSRVGFVRGYSRRFWQGDTFHR 
GSDKMPGRWTLLEDHEGCTWGVAYQVQGEQVSKALKYLNVREA 
VLGGYDTKEVTFYPQDAPDQPLKAIiAYVATPQNPGYLGPAPBEA 
IATQILACRGFSGHNLEYI;IiRVRI5VMQI,CGPQAQDEHI*AAIVDA 
VGTMLPCFCPTEQAIiAI*V 


5473 


3 


2119 


FWNVKLLIQDLEDIEQRVPVMDAQYKIITKTAHliITKESPQBEG " 
KEMFATMS KLKEQLTKVKEC YS PLL YESQQLL I PLBELEKQMTS 
FYDSLGK INE I ITVLEREAQS S ALFKQKHQELLACQENCKKTLT 
LI EKGSOSVQKJFVTLSN\^iKHFDQTRLQRQIADIHVAFQSMVKK 
TGD^KHVETNSRlJviKKFEESRAELBKVLRIAQEGLEEKGDPEE 
LLRRHTEFFSQLDQRVLNAFLKACDBLTDI LPEQBQQGLQEAVR 
KLHKQWKDIX2GEAPYHLLHLKIDVEKNRFLASAEECRTEI,DRET 
KLMPQEGSEKT I KEKRVFFSDKGPUHLCEKRLQL IEELC VKLPV"" 
RDPVRDTPGTCHVTLKBLRAAIDSTYRKLMEDPDKWKDYTSRFS 
EF8SWISTNETQLKGIKGSAIDTANHGEVKRAVBBIRNGVTECRG 
ETLSWLKSMiKVLTEVS SENEAQKQGDELAKLSSSFKALVTIjLS 
EVE KMLSNFG D CVQYKE IVKNSLEEL I SGSKEVQEQABXI LDTE 
NLFEAQQLLLHHQQKTKRISAKKRDVQQQIAQAQQGEGGLPDRG 
HEEI»RKLESTLDGLERSRERQE RR I OVTLEKW Eift PPTM VT?T\n7u 

YLPQTGSSHERPLSPSSLESLSSELEQTKEFSKRTESIAVQAEN 
LVKEAS EIPLGPQNKQLLQQQAKS I KEQVKKLEDTIjEEEYVTDK 
S 


5474 


2 


780 


TPDVRQLQASRRGIAVASWCSPRWRafiPFMayvirg^iJT t orto^v — 

LKRWKKNWFDLWSDGHL1YYDDQTRQWIEDKVHMPMRCTNIRTG 
QECRDTQPPDGKS KDCMLQ I VCRDGKTI SLCAESTDDCLAWKFT 
LQDSRTNTAY VGS AVMTDETSWS S PPP YTAYAAPAPE VGRTLS 
LQQAYGYGPYGGAyppGTQWYAANGQAYAVPYQYPYAGLYGQQ 
PANQVI IRERYRDNDSDLALGMLAGAATGMALGSLFWVP 


5475


2 


506 


ARGWLESLSLTCQTTPPPSSPCLLHSPSTPIHTMPPNLTGYYRF 
VSQKNMEDYLQALNI SLAVR KIALLIiKPDKEI BHQGNHMT VRTI> 
STFRNYTVOFDVGVEFEBDLRSVDGRKCQTIVTWEEEHLVCVQK 
GEVPNRGWRHWLEGEMLYLELTARDAVCEQVPRKVR 


5476


192 


1457 


suan3iiijuv,r v a aJvivviSSljKtiSWjyiiTS iHQYIiVDEPTLSWSR 
PSTRASEVLCSTKVSHYBLQVB1GRGPDNLTSVHLARHTPTGTL 
VTIKITNLENCNEERLKALQKAVILSHFFRHPNITTYKTVFTVG 
SWLWVX S P FMAYGSASQLLRT YFPEGMSETLIRNI ZiFGAVRGIiN 
YLHQNGCIHRS I KASHILISGDGLVTLSGLSHLHSIiVKHGQRHR 
AVYDFPQFSTSVQPWLSPELLRQDLHGYNVKSDIYSVGITACEL 
ASGQVPFQDMHRTQMLLQKLKGPPYSPLDISIFPQSESRMKNSQ 
SGVDSG 1GESVLVSSGTHTVNSDRLHTPSSKTFS PAFFSLVQLC 

LQQDPEKRPSASSUiSHVPPWATMTWRRCriT^CTT CT t nmtwi/nn<r 

S LPP VLPWTE P E CD FPDEKDS YWEF 


5477 


3 


1044 


RGNSRLRYSHEDELQLPRLPELFBTGRQLLDEVBVATEPAGSRI 
VQEKVFKGLDLLEKAAEMIiSQLDLFSRNBDIaEE IAS TDLKYJUL V 
PAFQGALTMKQVW PS KRLDHLQRAREHF IN YLTQ CHCYHVAEF3 
LPKXMNNS AENHTANSSMAYPS LVAMAS QRQAKI QRYKQKKELE 
HRLS AMKS AVESGQADDBR VRE YYLLHLQRWI DI SLEE I ES I DQ 
EIKILRERDSSREASTSNSSROKRPPVffPPTT TO"MMar»ainn?nj\ 

GYPSLPTMTVSDWYEQHRKYGALPDQG I AKAAPEEFRKAAQQQE 
EQE EKEE EDDEQTUIRAREWDD WKDTH PRG YGNRQNMG 


5478 


2 


835 


KTVR I WPlfVKG ESTVFRAHTATVRS VHFCSDGQS FVTASDDKT 
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDKTVK
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW
DVRTHRLLQHYQLHSAAVNGLSFHPSGNYLITASSDSTLKILDL
MEGRLLYTLHGHQGPATrVAPSRTGEYFASGGSDEOVMVWKSN'P 
DIGDHGEVTKVPRPPATLASSMGNLTVSILEQRLTLEEDKLKQC
LENQQLIMQRATP


5479


2 


835 


KTVRIWVPWKGESTVFRAHTATVRSVHFCSDGQS FVTASDDKT" ' 
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDKTVK
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW
DVRTHRLLQHYQLHSAAVNGLSFHPSGNYLITASSDSTLKILDL
MEGRLLYTLHGHQGPATTVAFS RTGE YPASGGSDEQ VMVNKSN P 
DIGDHGEVTKVPRPPATLASSMGNLTVSILEQRLTLEEDKLKQC
LENQQLIMQRATP


5480


444 


1952 


LSIiTSRMBEAELVKGRLQAITDKRKIQEEISQKRLKIEEDKLKH 
QHLKKKALR EKWLLDG IS SGKBQEEMKKQNQQDQHQ I QVLEQS I 
LRLEKEIQDLEKAELQISTKEEAILKXLKSIERTTEDIIRSVKV 
EREERAEES IEDIYAN I PDLPKSYI PS RLRK3 tNEEKFJDDEQMR 
KALYAMEIKVEKDLKTGESTVLSSIPLPSDDFKGTGIKVYDDGQ 
KSVYAVSSNHSAAYNGTDGLAPVEVEELLRQASERNSKSPTEYH 
EPVYANPPYRPTTPORSTVTPGPNFQERIKIKTNGLGIGVNESI 
HNMGNfGLSEERGNNFNHI S PIPPVPH PRS V I QQABE KIiHTPQKR 
LMTPWEESNVMQDKDAPSPKPRLS PRBTI FGKSEHQNSSPTCQE~~ 
DEEDVRYNrVHSLPPDINDTEPVTMIPMGYQQAEDSESDKKFLT 
GYDGIIHAELWIDDEEEEDEGEAEKPSyHPIAPHSQVYQPAKP 
TPLPRKRSEASPHEKHKS 


5481 


3 


1422 


NSPGSVCLCQCVCPSLLHCLPPLLLLLLLPIiLLHESPQPPALRV 
VATSSDIWFM^^Q^QKPVLTGQRF)CTPJCIlDEKEKFEPTVPRDTLV 
QGLNEAGDDLE AVAXFLDSTGS RLD YRR YADTLFDI LVAGSMLA 
PGGTRIDDGDKTKMTNHCVFS ANEDHE T IRN YAQVFNKLIRR YK 
Y LE KAFBDEM KKLLLFLKAFS ETEQTKLAMLS GILLGNGTLPAT 
1 LTSLFTDSLVKEG IAAS FAVKLFKAWMAEKDANS VTSSLRKAN 
LDXRLLELFPVNRQSVDHFAKYFTDAGLKELSDFLRVQQSLGTR 
KELQKELQERLSQECP I KEVVLYVKEEMKRNDLPETAVIGLLWT 
CIM^VEWWKKEELVAEQALKHLKQYAPLLAVFSSQGQSELI Ll> 
QKVQE YCYDWI HFMKAFQKI WLFYKADVLS EEA1 LKWYKEAHV 
AKGKS VFLDQMKK FVE WLQNAE EES ESEG E EN 


5482


1492 


528 


THWMTGMC YAPHQ VLS Y I NG VTTS KPG VS LVYSM PS RNLSLRL " 
EGLQEKDSGP YS CS VNVQDKQGKS RGBS I KTLELNVLVPPAP PS 
CRLQGVPHVGANVTLS CQSPRSKPAVQYQW DRQLPS FQTFFAPA 
LDVIRGSltShTNLiS SSMAGVYVCKAHNEVGTAQCNVTLEVSTGP 
GAAWAGAWG TLVGLGL1AGLV L"L Y~4R R RTf AT.R PD awn T VTm* 
IAPRTLPWP KSSDT I S KNGTLS SVTS ARALRPPHGPPRPGALT P 

TPSLSSQALPSPRLPTTDGAHPQPISPIPGGVSSSGLSRMGAVP 
VMVPAQSQAGSLV 


5483 


1 


788 


FFFFKGCRAGRGNESDYRKLEEMHQRFLVSERSKDDLOIiRLTRA 
ENRIKQLETDSSBEISRYQEMIQKLQNVLBSERENCGLVSEQRL 
KLQQENKQLRK3TE SLRKIALE AQKKAKV K IS TKEHEFS I KERG 
FEVOLREMEDSNRNS I VELRHLLATQQ KAANRW KEETKKLTESA 
Rl H TNNLKSELS RQKLHTQELLSQLEMANEKVAENEKLI LEHQB 
KANRLQRRLSQAEERAASASQQLSVITVQRRKAASLMNLENI 


5484 


3 


1997 


IMADMEDLFGSDADSEAERKDSDSGSDSDSDQENAASGSNASGS " 
ESDQDERGDSGQPSNKBLFGDDSEBEGASHHSGSDNHSERSDNR 
SEASBRSDHEDNDPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSE 
AEGSEKAHSDDBKWGRBDKSDQSDDEKrQNSDDEERAQGSDEDK 
I£NSDDDEKMQNTDDEERPQLSDDERQQLSEEEKANSDDERPVA 
SDNDDB KQNS DDEEQPQLSDBEKMQNSDDERP QAS DEEHRHS DD 
EEEQDHKSESARGSDSBDEVLRMKRKNAIASDSEADSDTEVPKD 
NSGTMDLFGGADDISSGSDGEDKPPTPGQPVDENGLPQDQQEEE 
PIPETRIEVB I PKVNTDLGNDLYFVKLPNFLSVE PRPFDPQYYE 
DE FED EEMLDEEGRTRLKLKVENTIR WRI RRDEEGNEIKBSNAR 
I VKWS DGSMSLHLGNE VFDVYKAPLQGDHNHLF IRQGTGIiQGQ A 
VFKTKLTFRPHSTDS ATHRJCMTL5LADRCSKTQ KI R1I»PMAGRD 
PECQRTEMI KKEEERLRAS I RRESQQPJRMREKQHQRGLSAS YLE 
PDRYDEEEEGEES I S LAA I KNR YKGGI RE ERAR X YSSD5DEGS E 
EDKAQRLLKAKKLTSDE VRPNLFNSRGLS CTQE PTALNBELTDQ 
AGTN 


5485 


161 


1074 


KRK I LSSMMDSEAHEKR P P ILTS SKQD I S PH ITNVGEM KH YLCG ™~ 
CCAftFNNVAITFPIQKVLFRQQIjYGl KTRDAILQLRRDGFRNLY 
RGILPPLMQKTTTLALMFGLYEDLS CLLHKHVSAPEFATSGVAA 
VLAGTTRAI FTPLERVQTLLQDHKHHDKFTNTYQAFKALKCHG I 
GEYYRGLVPILFRNGLSNVLFFGLRGPIKEHLPTATTHSAHLVN 
DFICGGLLGAMLGFLFFPINWKTRIQSQIGGEFQSFPKVFQXI 
WLERDRKLINLFRGAHLNYHRSLISWGIINATYEFLLKVI 


5486


1404 


142 


I PGSTI SWSPAAARGLSVCRCCRLHPASAMDLFGDLPEPERS PR 
PAAGKEAQKGPLLFDDLPPASSTDSGSGGPLLFDDLPPASSGDS 
GSIATS ISQMVKTEG KGAKR KTS EEEKNGS EEL VEKKVCKASS V 
IFGLKGYVAERKGEREEMQDAHVILNDITBECRPPSSLITRVSY 
FAVFDGHGGIRASKFAAQNLHQNLIRKFPKGDVISVEKTVKRCL 
LDTFKHTDEBFLKQASSQKPAWKDGSTATCVLAVDNILYIANLG 

DSRAI LCRYNEBSQKHAALSLS kehnptqyeermriqkaggnvr 
DGRVLGVLEVSRSIGDGQYKRCGVTSVPDIRRCQLTPNDRFILL 
ACDGLFKVFTPEEAVNFILSCLEDBKIQTRBGKSAADARYBASC" 
NRLANKAVQRGS ADNVTVMWR I GH j 


5487 


" " £*5 ■- 


182 


AVSLEOIRGLOTPAPVPliPr/}Pf'PQW(^MFPV'?r.fcT.T.T K&Ht Tn 

LEANDPFANKDDPPyyDWKNLQLSQLICGGLLAIAGlAAVLSGK 
CKCK33QKQH3 ? VPEKAI PLITPGSATTC 


5488 


1072 


259 


AMAASGBPQRQWQEEVAAWWGSCMTDLVSLTSRLPKTGETIH 
GHKFFI GFGGKGANQCVQAARI^AMTSMVCKVGKDS FGNDYIEN 
LKQNDI STE FT YQT KDAATGTAS I IVNNEGQNI I VIVAGANLLL 
NTEDL3AAANV I SRAKVM VCQLE I T PATSLEALTMARRS GVKTL 
r«FAFAJJUJI*DP{2rTc II^SDVFCCNESEABILTQLT VGoAADAGE 
AALVLLKRG CQWI ITIiG ASGCWLSQTE P E PKH I PTEKVKAVD 
TTVSFKI 


5489


81 


893 


GKGPVAAFIDQSNIFLTDPKIFLGQWREEPKMPLLLLGETEPLK
LERDCRSPVEPWAAASPDLALACLCHCQDLSSGAFPNRGVLGGV
LFPTVEMVIKVFVATSSGSIAIRKKQQEWGFLEANKIDFKELD
IAGDEDNRRWMRENVPGEKKPQNGIPLPPQIFNEEQYCGDFDSF
FSAKEENIIYSFLGLAPPPDSKGSEKAEEGGETEAQKEGSEDVG
NLPEAQEKNEEEGETATEETEEIAMEGAEGEAEEEEETAEGEEP
GEDEDS


5490 


81 


893 


GKGPVAAFIDQSNIFLTDPKIFLGQWREEPKMPLLLLGETEPLK
LERDCRSPVEPWAAASPDLALACLCHCQDLSSGAFPNRGVLGGV
LFPTVEMVIKVFVATSSGSIAIRKKQQEWGFLEANKIDFKELD
IAGDEDNRRWMRENVPGEKKPQNGIPLPPQIFNEEQYCGDFDSF
FSAKEENIIYSFLGLAPPPDSKGSEKAEEGGETEAQKEGSEDVG
NLPEAQEKNEEEGETATEETEEIAMEGAEGEAEEEEETAEGEEP
GEDEDS


5491 


204 


1194 


GSAPRLS LG PTG AQARDPD W WARPPS RP YTQS KEDRPDTEGRS E " 
QX3DMASSFLPAGAITGDSGGELSSGDDSGEVEFPHSPEIEBTSC 
LAELFEKAAAHLQGLIQVASREQIiLYLYARYKQ VKVGNCNTP KP 
S FFD FEG KQKWE AWKALGDSS P S QAMQE Y IAWKKLDPGWNPQI 
PEKKGKEANTGFGGPVISSIiYHEETlREEDKNI FDYCRENNI DH 
ITKA1KS FOWDVNVKDEEGRALLHWACDRGHKELVTVIiLQHRAD 
IN0QDNEGQTAI»HYASACEFLDIVEI)I»L»QSGADPTLRDQDGCLP 
EEVTGCKTVSLVIiQRHTTGKA 


5492 


3 


1896 


AS KNPIiSAVCTTG IMS SLAVRDPAMDRSLRS VFVGN I PYEATEE 
QLKD I FSE VGS WSFRIfVYDRETGKPKG YGFCE YQDQETALS AM 
RNLNGREFSGRALRVDNAASEKNKEELKSLGPAAPI IDSPYGDP 
I D PEDAP ESI TRAVAS L PPEQMFELM KQMKL CVQNSHQEARNML 
I^NPQLAYALLQAQVVMRIMDPEIALKILHRKIHVTPLIPGKSO 
*!V c ?V^nPrtP/lDrtPnt.f*DftPN\rT.T.tcmnTJO'DJit>nnmrT ^DDDtrvnT 

PPLMQTP IQGGI PAPGP IPAAVPGAGPGSLTPGGAMQ PQLGMPG 
VGPVPLERGQVQMSDPRAPIPRGPVTPGGLPPRGLLGDAPNDPR 
GGTLLSVTGEVEPRGYLGPPHQGPPMHHASGHDTRGPSSHEMRG 
G PLGDPRLLI GEPRG PM I DQRGLPMDGRGGRDSRAM ETRAMETE 
VLETRVMERRGMETCAM ETRGME ARGMD ARGL EMRG PVPSSRGP 
MTGGIQGPGP INIGAGG P PQGPRQVPGI SGVGNPGAGMQGTG I Q 
GTGMQGAG I QGGGMQGAG I QGVS I QGGG I QGGGI QGAS KQGGSQ 
PSSFSPGQSQVTPQDQEKAALIMQVLQLTADQIAMLPPEQRQSI 
LILKEQ IQKSTGAS 


5493 


1 


1876 


RAPMMTKAVPEEPRKPGRLTgALNSPLTWEHVWICVPGGTPDCL 
TDTFRVKRPHLRRSASNGHVPGTPVYREKEDMYDEI IELKKSLH 
VQKSDVDI^IRTXLRRLEEENSRKDRQ IEQIiLDPSRGTDFVRTIA 
BKRPDASWVINGLKQRILKLEQQCKEKDGTISKIiQTDMKTTNLE 
EMRIAMETYYEEVHRIKyrLLASSETTGKKPIXJEKKTCAKRQ^ 
GSALLSLSRSVQELTEENQSLKEDLDRVLSTSPTISKTQGYVEW 
SKPRIJjRJII VELEKKLS VME SS KSHAAEPVRSHP PACLAS SS AL 
HRQPRGDRNKDHERLRGAVRDLKEERTALQE<3LLQRDLEVXQLL 
QAKADLEKELECAREGE BERRERE EVI»REB IQTLTSKLQ ELQEM 
KKE EKEDCPBVPHKAQELPAPTPS SRHCEQDWP P DSS BBGLP RP 
RSPCSDGRRDAAARVLQAQNK^nflGiKKKKAVLDEAAWIKiAAFR 
GHLTRTKLLASKAHGSEPPSVPOriPDQSSPVPRVPSPiA^ATGS — 
PVQEEAIVI IQSALRAHLARARHSATGKRTTTAASTRRRSASAT 
HGDAS S P P FLAALP DPS PSGPQAVAP L PGDD VNSDDSDD I VIAP 
SLPTKNFPV 


5494


71 


536 


RS KAKIGTPTREVPSTDMXVRRESSSSLTHRPAPSPATPRLLGT 
RRVLLGVSEGTGCADAMEt»VLVFLCS UAPMVLASAAEKEKBKD 

PFHYDYQTLRIGGLVFAWLFSVGILLILSRRCKCSPNQKPRAP 
GDEEAQVENLI TANATE P QKAEN 


5495 


273 


2168 


DSLLLIQVu'j^PFTiHLRSRLPSAIRSLILQKKPWIRNTSSMAG 
BLRPASLWItPRSLAPAFERFCQVWTGPLPIiLGQSEPEKWMLPP 
QGAISETRMGHPQFWKyEPGACTGSLASLEQYSEQLKDMVAFFIj 
G CS FS LtEEAL EKAGLPRRDPAGHS QAGAYKTT VPCVTHAGFCCP 
LWTMR PI P KDKIiEGtiVRACCSLGGEQC3Q PVHMGDPELLG I KEL 
SKPAYGDAMVCPPGBVPVFWPSPIiTSLGAVSSCETPLAPAS IPG 
CTVMTDLKDAKAPPGCLTPERlPEVHHISQDPLHysiASVSASQ 
KlRELESMIGIDPGNRGIGHLLCKDELLKASIiSLSHARSVLlTT 
GFPTHFNHE PPEETDGPPGAVAI»VAF1»QALEKE VA I I VDQRAWN 
LHQKIVEDAVEQGVLKTQIPILTYQGGSVEAAQAFLCKNGDPQT 
PR FDHLVA I ERAG RAADGNY YNAR KMNI KHLVD P IDDLFLAAKK 

IPGISSTGVGDGGNELGMGKVKEAVRRHIRHGDVIACDVEADFA 
VIAGVS NWGGYALACALY I LYS CAVHSQYLRKAVGPSRAPGDQA 

WTQALPSVIKBEKMLGILVQHKVRSGVSGIVGMEVDGLPFHNTH 
AEMIQKIiVDVTTAQV 


5496 


3 


2408 


QDTKMHEIYKGNITPQLNKNTLKTSAATDVWAVYFSQFWIDY3G"" 

MKSGKGRPISPVDSFPLSIWICQPTRYAESQKEPOTCNQVSLNT 

SQSESSDIiAGRliKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 

FSPSSSEADIKIiLVHVHKHVSMQINHYQYLLLLFLHESLILLSE 

NI^KDVEANTTGSPASQTSICIGILLRSAElJUiLIaHPVDQANTLK 

SPVSESVSPWPDYIiPTENGDFIiSSKRXQISRDINRIRSVTVNH 

MSDKRSMSVDI^HI PLKDPIjLFKSASDTNLQKGI S FMDYLS DKH 

LGKI S E DE S SGLVY KSGSGE I GSETSDKKDSFYTDS SS VLN YR3 

DSNILSFDSDGNQNILSSTLTSKGNETIESIFKABDLLPEAASL 

SENLD ISKEETP P VRTLKSQ SSLSG KPKERCPPNLAPLCVS YXN 

MKRSSSQHSLDTISLDSMILEEQLIiESDGSDSHMFLEKGNKKNS 

TTNYRGTABSVNAGANLQWYGETSPDAISTNSEGAQENHDDLMS 

VWFKITGVNGEIDIRGEDTEICLQVWQVTPDQLGNISLRHYLC 

NRPVGSDQKAVIHSKSSPEISLRFESGPGAVIHSLLAEKNGFLQ 

CHIKWFSTEFLTSSI^IQHFLEDETVATVMPMKIQVSNTKINL 

KDDSPRS S TVS LEPAP VTVHI DHLWERS DDGS FH I RDSHMLNT 

GNDLKENVKS DS VLLTSGKYDIiKKQRS VTQ ATQTS PG VPWPSQS 

ANFPEFSFDFTREQLMEENESLKQEIiAKAKMALAEAHLEKDALL 
HHIKKMTVE 


5497 


1821 


3308 


S I SKLLKRRSNIDAYLLSNS CAFFAPRLFS LASQ1 IREQQS PNV 
CFIYKYSGFPSLECQCHFVSPHSSCYIWFFSFPPPFFVCFQLSM 
GFSHYSLSSESHVGPTGAGLFPHCLPASRLLPRVTSVHLPDYAH 
YYTIGPGMFPSSQI PS WKDWAKPGPYDQPLVNTLQRRKEKREPD 
PNGGGPTTASGPPAAAEEAQRPRSMTVSAATRPGBEMEACEELA 
LALSRGLQLDTQRSSRDSLQCSSGYSTQTTTPCCSEDTIPSOVS 
DYDYFSVSGDQEADQQBFDKSSTIPRNSDtSQSYRRMFQAKRPA 
S TAGLPTTLGPAMVTPG VAT IRRTPSTKPS VRRGTIGAGP I PI K 
TPVIPVKTPTVPDLPGVLPAPPDGPEERG2HSPESPSVGEGPQG 
VTSMPSSMWSGQASVNPPLPGPKPSIPEEKRQAIPBSEAEDQBR 
EPPSATVSPGQI PESDPADLS PRDTPQGEDMLNAIRRG VKLKKT 
TTNDRSAPRFS 


5498 


2434 


1492 



ILTHQE I FTGE KPCE CGKASI QMSHLSQQKI YSGENPFACKVCG 
KVFSHKSNLTEHEHFHTREKP PECNEGGKAFSQKQYVI KHQNTH 
TGEKLFECNECGKSFSQKENLLTHQKIHTGEKPFECKDC3GKAFI 
QKSNLIRHQRTHTGE KP F VCKECGKT FSGKSNliTEHEKIHI GEK 
PFKCSECGTAFGQKKY L IKHQNIHTGEKP YECNEOGKAFSQRTS 
LIVHVRIHSGDKP YECNVCGKAFSQS SSLTVHVRSHTGEKPYGC 
^ECGKAFSQFSTLALHLRIHTGKKPYQCSECGKAFSQKSHHIRH 
QKlHTk 


5499 


324 


926 


GFGQrGRGHKI?TYPFSPRKSGRKGMAQSQGWVKRYIKAFCKG~ 
FVAVPVAVT PLDRVACVAR VEGASMQ?SLNPGGSQSS DWLLNH 
WKVRNPEVHRGDIVSIiVSPKNPEQKIIKRVIALBGDIVRTIGHK 
NRYVKVPRGHI WVEGDHHGHSFDSNSPGPVSLGLLHAHATHI LW 
PPERWQKLESVLPPERLPVQREEB 


5500 


1978 


1286 


KPD WRLQNLP PRL YLWRS S RFGFGHLKKRLQMDFKI EHT WDG PP 
VKHEPVPIRLKPGDRGVMMDISAPFPRDPPAPLGBPGKPFNBLW 
DYEWBAPFXN D I TEQ YLE VBliCPHGQHLVLLLSGRRNVWKQElt 
PLSFRVSRGETKWEGKAYLPWSYFPPNVTKFNSFAIHGSKDKRS 

YEALYPVPQHBLQQGQKPDFHCXEYFKSFNFWTLLGEEWKQPSS 
DLWLIBKCDI 


5501 


2327 


2226 


CRPPVSARVAPGHQGAVGGSGRRPARVSWDAAARPSSRPFSLP 
AAlMLALISRLLDWFRSLFWKBEMELTLVGLQYSGKTTFVNVlA 
SGQFSEDMIPTVGFNMRKVTKGNVTIKIWDIGGQPRFRSMMERY 
CRGVNAI VYMIDAADREK1 EASRNELHNLLDKPQLQGI p VLVIiG 
NKRDLPNALDEKQLIEKMNLSAIQPREICCYS ISCKEKDIf IDIT 
LQWLIQHSKSRRS 


5502 


3


824 


NSAFPVWVPERTALLTCPLGAAPGSSREAJPGIAGPPNSTAfiSKn" 
GKFFKGGGSSXSRAAPSPOEALVRLRETEEMLttPTlfnirVT pmd Tr\ 
REI ALAKKHGTQ^JKRAALQALKRKKIlFEKQLTQI DGTIiST I EFQ 
REALENS HTNTEVLRNMGFAAKAMKS VHENWDLNTKIDDLMQE I T 
EQQD1AQEISEAFSQRVGFGDDFDEDEI*MAELEELEQEELNKKM 
TNIRLPNVPSSSLPAQPNRKPGMSSTARRSRAASSQRAEEEDDD 
IKQLAAWAT 


5503 


216


654 


KGVRRRGR VRSDSE DSHLG Y F KMS FLX»P KLTS KKE VDQAiXSTA 
EKVLVLRFGRDEDPVCLQLDDILSKTSSDLSKMAAIYLVDVDQT 
AWTQYFDISYI PSTVFFFNGQHMKVDYGGEDPALRSIXAVRRT 
S PAGTLG E KPVKS 


5504 


58 


3563 


QLSFSFQAPVTFDDITVYLLQEBWVLLSQQQKBIiCGSNKLVAPIi 
GPTVANPELFRKFGRGPEPWLGSVQGQRSLLEHHPGKKQMGYMG 
EMEVQG PTRESGQSLP PQKKAYLSHLS TGSGH I EGD WAGRNRK L 
LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREYPSIRDKRSRL 
I EGYTG PFKVETLKYHAKSKAHMFCVNAIAARDPI WAARFRSIR 
DPPGDVLAS PEPLFTADCP I FYPPGPLGGFDSMAELLP5SRAEL 
BDPGGDGA1PAMYLDCISDLRQKEITDG1KSSSD1NILYXDAVE 
SCIQDPSABGLSEEVPWFEELPWFE0VAVYFTREEWGMLDKR 
QKELtTlDVMRMNYELLASLGPAAAKPDLISKLERRAAPWIKDPN 
GPKWGKGRPPGNKK^AVREADTQASAADSALLPGSPVEARASC 
CS5SICEEGDGPRRIKRTYRPRSIQRSV7FGQFPWLVIDPKETKL 
FCSACI ER PNLHD KS SRL VRG YTG PFKVETLKYHE VS KAHRLCV 
NTVS I KE DT PHTAL V PE I S S DLMANMEHF FNAAYS IAYH S RP LN 
DFEKILQLLQSTCTVIl^KYP^RTACTQFIKYTSETLKREILBD 
VRNSPCVS VLLDSSTDASEQACVGI YIRYFKQMEVKES YITLAP 
LYS ETADG YFET1 VS AI*D ELD I P FRKFG WWGLGTDGS AMLS CR 
GGLVEKFQE VI PQLLP VHCVAHRLHI&VVDACGS 1DLVKKCDRH 
IRTVFKFYOSS.VKRLNELQEGAAPLEQBIIRLKDLNAVRWASR 
RRTLHALLVS W PALARHLQRVAEAGGQ I GHRAKGKLKLMRGFHF 
VKFCHPLLDFLS I YRPLS E VCQKEIVL I TE VNATLGRAYVALES 
LRHQAGPKEEEFNASFKDGRlJiGICLDKI^VAEQRFQADRERTV 
LTGI BYLQQRFDADRP PQLICNMEVFDTMAWPSGIELASPGNDDI 

lniaryfecslptgyseealleewlglktiaqhlpfsmu:knal 

AQHCRFPLLS KLMAWVCVPI S TSCCERGFKAMNR IRTDERTKI* 
SNE^NMLMMTAVNGVAVTEYDPOPAIGHWYLTSSGRRFSHVYT 
CAQ VPARS PASARLR EGSEMGAIi YVEE PRTQKPP II*P SREAAEVL 
KDCIMEPPBRLLYPHTSQEAPGMS ! 


5505 


3312 


1219 


HQS P RSi*SAAK113NRNNNKLPSNL PQLQNL I KRDPPAY I EEFLQ 
QYNH YKSNVE IFKLQPNKPS KELAEL VWFMAQISHCYPE YLSNF 
PQE VKDLLSCNHT VLDPDLRMTFCKAIjI LLRNKNLINPS SLLEL 
FFELFRCHDKl^RKTLYTHIVTDIKNINAKHKIWKVNVVl/JNb^ 
yTMLRDSNATAAKMSLDVMIBLYRRNIWNDAKTVNVITTACPSK 
VTK IL VAALTFFIX3KDED EKQDS DS BSEDJDGPTARDLLVQ YATG 
KKSSK^KKliEKAMKVljKKHRKKKKPRVFNFSATPT.THnpnni?^ 
BKLLKQLECCKBRFEVKMMLMNLISRLVGIHELFLFNFYPFLQR 
FLQPHQREVTKILLFAAQASHHLVPPBI iQSLIiMTVANNFVTDK 
NSGE VMTVGINAIKBI TARCPIiAMTBEXLQDLAQYKTHKDKNVM 
MSARTLIHLFRTLNPQMLQKKFRGKPTEAS1EARVQEYGELDAK 
DYIPGAEVLEVEKEENAENDBDGWESTSLSEEEDADGEWIDVQH 
SSDEEQQEISKKLNSMPMEBRKAKAAAISTSRVLTQBDFQKIRM 
AQMRKELDAAPGKSQKRKYtBIDSDEEPRGELLSLRDIERIiHKK 
PKSDKETRLATAMAGKTDRKEFVRKKTKTNPFSSSTNKEKKKQK 
NFMMMRYSQNVRSKNKRSFREKQLALRDALLKKKKRMK 


5506 


1 


1531 


hHGDLCGQRGGSAP^^SSAWPAPA^LPERERERFALCPGRS 
CSGGGGEETPGTTPVWSPLSGGGDEELRPNPYVRPPYRWWAWV 
ljHnp ' »i**/v»<j»J!» i f HAFFBfa VvrQbWFFRFVvNAAGYASFMVPGY 
IXVQYFRRKNYLETGRGLCFPIiVKACVFGNEPKASDEVPLAPRT 
EAAETTPMWQALKLLFCATGLQVSYLTWGVLQERVMTRSYGATA 
TSPGERFTDSQFLVIJWRVLAI>IVAGLSC^/LCKQPRHGAPMYRY 
SFASLSNVLS SWCQYEALKFVS FPTQVLAKAS KVT PVMLMGKLV 
SRRSYEHWEYLTATLIS IGVSMFLLSSGPEPRSS PATTLSGLIL 
LAG Y I AFDS FTS N WQDALFA YKMS SVQMM FCVNFFS CLFTVGSL 
LEQGALLEGTRFMGRHSEFAAHALLLS ICS ACGQLFIFYTIGQF 
GAAWTIIMTLRQAFAILLSCXLYGHTVTVVGGLGVAVVFAALL 
LRV YARGRLKQRG KKAVP VE3 PVQ KV 


5507 


3704 


1271 


PRGTRRCRPAGRASRRARRRPPCPGPAAFGSLEIGGFGTAAGKK " 
VAVADVQFGPMRFHQDQLQVLIiVFTKEDKQCNGFCRACEKAGFK 
CTVTKEAQAVLACFLDKHHDIIIIDtlRNPRQLDASALCRSIRSS 
KI..SENTVIVGWRRVDREELSVMPFISAGFTRRYVENPNIMACY 
NELLQLEFGE\ r RSQLKLRACNSVFTALENSEDAISITSEDRFIQ 
YANPAFETTMG YQSGEL IGKBLGEVP INEKKADLLDT INS C I RI 
GKEWQG I YYAKKKNGDNIQQNVKI I P V IGQGG K I RHYVS 1 IRVC 
NGNNKAEKISBO/QSDTHTDNOTGKHKDRRKGSLDVKAVASRAX 
EVSSQRRHSSMARIHSMT1EAPITKVIN1INAAQESSPMPVTEA 
jjwk viL&xurtx *&Aix airUif\*/UuJUUPHANOljVGGw 
NEYVLSTKNTQ>1VSSNIITPISI,DDVPPRIARAMBNEEYWDFDI 
FELEAATHNRPLI YLGLKM FARFG I CEFLHCSE STLRS WLQI I E 
ANYHSSNP YHNSTHSADVLHATA YFLS KER IXETLDPI D3 VAAIi 
IAATIHDVDHPGRTNSFLCMAfiQRTATT.VTJTYraVT vquua&i no 

QLTTGDDKCNIFKNMERNDYRTLRQGI I DMVLATEMTKHFEHVN 
KPVKSINKPLATLBENGETDKNQEVINTMLRTPENRTLIKRMLI 
KCADVSNPCRPLQYCIEWAARISEEYFSQTDEBKQQGLPWMPV 
FDRNTCS IPKSQIS F I DYF I TDMFDAWDAFVD L PD LMQHLDNN? 
KYWKGLDEMKLRNLRPPPB 


5508 


1151 


*91 " 


I*SSVFSRRSASMFAVGCSMGPFLHYWYLSLDRLFPASGIiRGFPN 
VLKKVLVDQLVASPUXJVWYFI/?IX?CLEG(^GESCQELREKFW 
EFYKADWCVWPAAQFVNFLFVPPQFRVTYINGLTLGWDTYDSYL 
KYR3PVPLTPPGCVALDTRAD 


5509 


1236 


619 


RKSRGCQNAI^ASGPAAAAAAIM\OlKLKFHEQKLLRQVDi?LNME 
VTDHNLH ELRVL RRYRLQRRED YTRYNQLS RAVRELARRLRDLP 
ERDQPRVRASAALLDKLYALGLVPTRGSLELCDFVTASSFCRRR 
LPTVLLKLRMAQHLQAAVAF VEQGHVRVG PDWTDPAFLVTRSM 
EDFVTWVDSSKIKRHVIiEYNEERDDFDLEA 


5510 


96 


1195


PAGAHIiSSGS S EPLVEPGRGRVGAR VKGER^lliQASGS APGRS KM 
AEGERQPPPDSSEEAPPATQNFI I PKKEIHTVPDMGKV7KRSQAY 
ADYIGF I Ll'I^NEGVKGKKLTFEYRVSE A1E KIiVAl.LirrLDRW I D 
ETPPVDQPSRFGNKAYRTWYAKLDEEAENLVATWPTHLAAAVP 
EVAVYLKES VGNSTR IDYGTGHEAAFAAFLCCLCXIGVLRVDDQ 
IAIVFKVFNRYLEVMRiOiQKTYRMEPAGSQGVWGLDDFQFLPFI 
WGSSQLIDHPYLEPRHFVDEKAVNENHKDYMFLECILFITEMKT 
GPFAEHSNQLWNISAVPSWSKVNQGLIRMYKAECLEKFPVIQHF 
KFGSLLPIHPVTSG 


5511


276 


1980 


KLSRVLNLPPHJJLlTSX^Pt^QKKtlVADKQLSVDSLLEKDND'- 
HSRPDI Q VQAKRLAEKLRCDTWSB I STGQRTVN FKI NRELLTK 

TVLQQVIEDGSKYGLKSELFSGLPQKKIWEPSSPNVAKKPHVG 
HLRST I IGNF IANLKB ALGHQVIR IN YliGDWGMQPG ULGTG FQI) 
FGYBBKLQSNPLQHI»PBVYVQVNKBAADDXSVAKAAQBFFQRLE 
LGDVQALSLWQKFRDLSIBEYIRVYKRIiGVYFDEySGESFYREK 
SQEVLKhhES KGLLLKT1 KGTAWDLSGMGD PSSI CTVMRSDGT 
SLYATRDLAAAIDRMDKYNPDTMIYVTDKGQKKHFQQVFQMLK3! 
MGYDWAERCQHVP FGWQGMKTRRGDVTFLEDVIiNBIQLRMLQN 
MASIKTTKELKNPQETAERVGLAALIIQDFKGLLLSDYKFSWDR 
VFQS RGDTGVFLQYTH ARliHS LBETFGCG YLNDFNTACLQE PQS 
VS I LQHLLK FDEVLYKSSQDFQPRHI VS YLLTLSHLAAVAHXTL 
QI KDS P PE VAGARLHIjFKAVRS VLANGMKLLG ITPVCRM 


5512 


120 


1015 


DP SuLLT I T VTGVTVLVL VLKS MNSRRREPI TLQD PEAKYPJCpI* 
I E KEKI SHMTRRFRFGL P S P DH VLG LP VGN YVQLLAKI DNELW 
RAYTPVSSDDDRGFVDL 1 1 KI YFKNVHPQYPEGGKMTQ Y1»ENMK 
iGETIFFRGPRGRIJ'YJIGPGNLGIRPDQTSEPKKTLADHIiGMIA 
GGTGITPMLQLIREITKDPSDRTRMSLIFANOTRPnTT \rov~cr r> 

BIARTHPDQFDLWYTLDRPPIGWKYSSGFVTADMIKEHLPPPAK 
STLILVCGPPPLIQTAAHPNLBKLGYTQDMI FTY 


5513 


2 


837 


ARPn^PSDSPRIPPAGAETPGRGSCRNYLPSSS^PPPEPSSFPS 
P PTSRGGPGSRDTMSDSEEBSQDRQljKIWLGDGASG KTSLTTC 
FAQETFGKQYKQTIGLDFFLRRITLPGNLNVTLQIWDIGGQTIG 
G KMLD KY I YG AQGVLLVYD I TKYQS FENLEDW YT WKKVS E ES E 
TQP LVALVGN KIDLE HMRT I KPE KHLRFCOENG F<5 <?H FV<: a xrrr 

DSVFLCFQKVAAEILGIKIiNKABIEOSQRWKADXVNYNQBPMS 
RTVNPPRSSMCAVQ 


5514 


1295 


449 


WRPSWIMGNFRGHALP^TFFFlIGLWWCTiCSiLKVlCKKQl^T" 
CYLGSXTLFYRLBILEGITIVGMALTGMAGEQFrPGGPHLMLYD 
YKQGHWNQLLGWHHFTMYFFFGLLGVADILCFTISSLPVSLTKL 
MLSNALFVEAFI PYNHTHGREMLD I FVBQLL Vl» WFLTGL VA FL 
EFLVRNN VLLELLRSS I» ILIiQGSWFFQ IGFVLYP PSGGPAWDLM 
DHENIL FLT I CFCWHYAVTI VIVGMNYA FITWL VKSRLKRLCS S 
EVGLLKNAEREQESE5EM 


5515 


1572 


260 


FVRL VGRGDCD PLLS VCLTTM PLYBGLGS GGE kTAWI DLGEAF — 
TKCGFAGETGPRCI I P S VI KRAGMP K PVR WQYN INTEELYS YL 
KEFI HI IiYFRHLLVNPRDRRWT IBS VLCPSHFRETLTRVLFKY 
FBVPS VLLAPSHLMALLTLG INS AMVLDCG YR BS LVLP I YEGI P 
VLNCWGALPLGGKALHKELETQIiLEQCTVDTSVAKEQSLPSVMG 
SVFEGVLEDIKARTCFVSDLKRGLKIOAAKFNIDGNNE'RFSPPP 
NVDYPLDGEK1LHI LGS IRDS WEI LFEQDNEEQS VATLI LDSL 
IQCP IDTRKQLAENLWIGGTSMLPGFLHRLLAEIRYLVEKPKY 
KKALGTKTFRIHTPPAKANCVAWLGGAI FGALQDILGSRS VSKE 
YYNQ TGRI PDWCS LNWP PLEMM PDVGKTQ PPLMKRAFSTE K 


5516


3 


735


NSREPPOAGPGPSPRKSPTASSFLFPWRPIASSFffWGAQGAQES 
iKAWWRVPGTTRRPVTOESPGMHRPEAMlibLTiiALLGGPTWAG 
KMYGPGGGKYFS TTED YDHEI TGLRVS VGI»L L VKS VQVKCjGDS W 
DVKLGALGGNTQEVTLQPGEYITKVFVAFQAFLRGMVMYTSKDR 
YFYFGKLDGQISSAYPSQEGQVIiVGIYGQYQLLGIKSlGFEWNY 
PLEEPTTEPPVNLTYSANSPVGR • 


5517 


246 


499 


S EI YVAMRTDSSKMTDVESG vanfassaragrrnalpdiqssaa 
TDGTSDLPLKLEALSVKEDAKEKDEKTTQDQLEKPQNEEK 


5518 


3 


1375 


DAWADAWVRAWDLNMDFPCLWLGLLLPI,VAALDFNYHRQEGMEA 
FLKT VAQNYSS VTHLHSIGKSVKGRNLWVLVVGR FPKEHRIG I P 
EFKYVAMiHGDF/rVGREI*IJJlLIDYLVTSDGKDPEITNLINSTR 
I HI MPSMNPDGFEAVKKPD C YYS IGRENYNQYDLNRNFPDAFSY 
lWSRQPETVAVHKNLKTETFVLSANUiQGALV2iS YPFDNGVQA 
TGALYSRSLTPDDDVFQYLAHTYASRNPNMKKGDECKNKMNFPN 
SVTNGYSWYPLQGGMQDYNYIWAQCFEITLELSCCKYPREEKLP 
S FWNNNKASLIBYI KQ VH LGVKGQVFDQNGN PLPNV I VE VQDRK 








H I CP YRTNKYGE YYLLLLPGS YI IKfVTVPGHDPHI TKV 1 1 PEKS " 

QNFSALKKDIIiLPPQGQLDSlPVSNPSCPMIPLYRNLPDHSAAT 
KPSLFLFLVSLLHIFPK 






87 


477


IKSKl^QQVEVQESEWRLTEAKGPTMGKBSGWDSGRAAVAAVVG 
GWAVGTVLVALSAMGFTSVGIAASS IAAKMMSTAAIANGGGVA 
AGSLVAILQSVGAAGLSVTSKVIGGFAGTALGAWLGSPPSS 




5520


117 


943 


PTEGRQKVLKTFTVPRSAbAMTKTSTClYHPLVI^KYTFI^YTr" 
SQEGKDEV2CPXILAWGARWKYMTLLNLLLQTIPYGVTCLDDVLK 
RTKGGKDIKFLTAFRDLLPTTLAPPVSTFVFLAFWILFLYNRDL 
I Y PKVLDTVI ? VWLNHAMHTFI FPI TLAE VVLRPHS YPS KKTGI» 
TLLAAAS I AY I SR I LW LYFETGTWVY P VFAKLS LLGLAAF FSLS 

YVFIASIYLLGEKLNHWKWSVQILQRWRLESVGICFQWPDWKS 
PAKHQLVKNIR 


5521 


54 d 


911 


KILNMQKSCKENEGKPQNMPKAEMbRPLEDVPQEAEGNPQPSEE 
GVS0EAEGNPRGGPNQPG03FKEDTPVRHLDPEEMIRGVDE1.ER 
LREEIRRVRNKFVMMHWKQRHSRSRPYPVCFRP 




5522 


1224 


637 


GSRPLGQRSREKMWVFGYGSLIWKVDFPYQDKLVGYITNYSRRF 
WQGSTDHRGVPGKPGRWTLVEDPAGCVWGVAYRLPVGKEEEVK 
AYIiDFREKGGYRTTTVIFYPKDPTTKPFSVLLYIGTCDNPDYLG 
PAPLEDIAEQtFNAAGPSGRNTEYI>FELANSIRNLVPEKADEHL 
PALE KliVKERLEGKQNLNC I 




5523 


3 


1280 


S KGKKRMGS SMS AATARRP VFDD KEDVNFDHFQ I LRAI G KGS FG 
KVCIVQKRDTEKMYAMKYtWKQQCIERDEVRNVFRELEILQElE 
HVFLVNLWYSFQDEEDMFMWDtiLLGGDLRYHLQQNVQFSEDTV 
RLYICEMALALDYLRGQHI IHRDVKPDNILLDERGHAHLTDFNI 
ATI IKDGBRATALSGTKP YMAPE I FHS FVNGGTG YSFE VDWWS V 
GVMAYELLRGWRPYDlHSSNAVBSIiVQIiFSTVSVQYVPTWSKEM 
VALLRKDLT VWPSHRLS SLQD VQAAPALAG VL WDHLSEKR VE PG 
FVPNKGRLHCDPTFELEEM I LESRPLHKKKKRLAKNKSRDNS RD 
SSQSENDYLQDCLDAI QQDFVI FNREKLKRSQDLPREPLPAPES 
RDAAEPVEDEAERSALPMCGPICPSAGSG 




5524


85 


2319 


RERERDHRPGESSQGQSGAGGCFPSPTMEIiRCGGLliFSSRFDSG 
NLAHVEKVESLSSDGEGVGGGASALTSGIASSPDYEFNWfTRPD 
CAETE FENGNRSWFYFS VRGGMPGKL I KINIMNMN KQS KLYSQG 
MAPFVRTLPTRPRWER I RDR PTFEMTETQFVLS FVHRFVEGRGA 
TTFFAFC YPFS YS DCQELLNQLDQRF PENHPTHSS PLDT I YYHR 
BLLCTSLDGIiRVDLLTITSCHGLREDREPRLEQLFPDTSTPRPF 
RFAGKR I FFLSSRVHP GETP S S FVFNGFLDFI LR PDDPRAQTLR 
RLPVFKLI PMLNPDG WRGHYRTDSRGVNLNRQ YLKPDAVLHPA 
IYGAKAVtiLYHHVHSRLNSQSSSBHQPSSCLPPDAPVSDLEKAN 
NLQNEAQCGHSADRHNAEAWKQTEPAEQKIjNSVWIMPQQSAGLE 
ESAPDTI PPKESGVAY YVDLHGHASKRGCFMYGNS FS DESTQVE 
NMLYPKLISLNS AHFDFQGCNFSEKNMYARDRRDGQS KEGSGRV 
AI YKASGI IMSY TLE CNYNTGRS VNS I PAACHDNGRAS PPPPPA 
FPSRYTVELFBQVGRAMAIAALDMAECNPWPRrVLSEHSSLTNL 
RAWMLKHVRNSRGLSSTLNVGVNKKRGLRTPPKSHNGLPVSCSE 
NTLSRARSFSTGTSAGGSSSSQQNSPQMKN3P3FPFHGSRPAGL 
PGlXSSSTQKVTHRVLGPVRGKPVWEPLQHVFGCLGfrCWGK 


5525
"3521 


105 


834 


SNTLDFERHLFlMGQQISDQTQLVIUKLPEKVAKHVTLVRBdGS " " 
LTYEEFI^RVAELNDVTAKVASGQEICHLIiFEVQPGSDSSAFWKV 
WRWCTKINKS SG t VEASRI MNLYQ F IQLYKDI TSQAAGVLAQ 
SSTSEBPDENSSSVTSCQASLWMGRVKQLTDEEECC1CMDGRAD 
L I LP CAHS FCQKC IDKW3DRHRNCPI CRLQMTX3ANES WWSDAP 
TEDDMANYILNMADEAGQPHRP 




3 


853 


RRPCNP VRAAKRTGAAARA PRGLE VTMLR VAWRTLS LI RTRAVT 
QVLVPGLPGGGSAKFPFNQWGLQPRSXLLQAARGYWRKPAQSR 
LDDDP PPSTLLKD YQNVPG I E KVDDVVKRLLSLEMANKXEMLKI 
KQEQFHKKIVANPEDTRSLBARI IALSVKIRS YEBHLEKHRKDK 
AHKRYLLMSIDQRKKMLKNLRKrNYDVFEKICWGLGIEYTFPPL 
YYRRAHRR FVTKKAliC I RVFQETQKLKKRRRALKAAAAAQKQAK 








RRNPDSPAKAIPKTLKDSQ 


5527 


3225 


565 


KFQTKGIKWGKWKEVKIDPNMPADGQMDDLVCFEELTDYQLVS 
PAKNPSSLFSKEAPKRXAQAVSEEEEEBEGKSSSPKKKIKLKKS 
KNVATEG TS TQFCE PE VKDPELEAQGDDMVCDD PEAGEMTSENLV 
QTAPKKKKNKGKKGLEPSQSTAAKVPKKAKTWIPEVHDQKADVS 
AWKD LFVPRP VLiRALS FLGFS APTP IQALTLA PAIRDKLDI LGA 
AETGSGKTLAFAIPM IHAVLQWQKRNAAPPPSNTEAPPGETRTE 
AGAKTRS PG KAE AES DAL PDDT V I ES EAL PS D IAAE ARAKTGG T 
VSDQALLPGDDDAGEGPSSLIREKPVPKQNEMEEENLDKEQTGN 
LKQELDD KSATCKAYP KR PLLGLVLTPTREIAVQVKQH IDAVAR 
FTG I KTAILVGGMSTQKQQRMLNRRPB I WATPGRL WELIKE KH 
YHLRNLRQLRCLWDEADRMVBKGHFAELSQLIiEMLNDSQYNPK 
RQTLVFSATliTLVHQAPARlLHKKHTKKMDKTAiaDLLMQKIGM 
RGKPKVIDI#TRNEATVETLTETK I H CETDEKDFYXY YF1MQ YPG 
RSLVFAN S 1 S C I KRLSGLLKVLD I M PLTLH ACMHQKQRLRNLEQ 
FARLED C VLL ATDVAARGLDI P ICVQHVIHYQ VPRTS E IYVHR5G 
RTARATNEGLSLMLIGPEDVINFKK1 YKTLKKDEDI PLFPVQTK 
YMDWKERI RLARQ I E KSE YRNFQACLHNS W I EQ AAAALE I ELE 
EDMYKGGKADQQEERRRQKQMKVLKKELRHLIiSQPLFTESQKTK 
YPTQSGKPPHiVSAPSKSESALSCLSKQKKKKTKKPKEPQPEQP 
QPSTSAN 


5528 


3


895 


GPFLSACRMWGACKVKVHDSIiATI S ITLRRYLRLGATMAKS KFB 
YVRDFEADDTCLAHCWWVRLDGRNFHRFAEXHNFAKPNDSRAI* 
QLM7KCAQTVMEELED I VI AYGQSDEYS P VF KRKTNWFKRRAS K 
FMTHVASQFASSYVFYWRDYFEDQPLLYPPGFDGRWVYPSNQT 
I» KD YLS WRQADCHINNL YNT VFWAL IQQSGLTPVQAQGRLQGTIi 
AADKNEILFSEFNXNYNNEPPMYRKGTVLIWQKVDEVMTKEIKL 
PTEMEGKKMAVTRTRTKPCKPSHLPRAPCLRWL 


5529


48 


640 


TFRLVSAHrjKTRKLINPEAAERRWRDWDSRQGWLSVKMQRVSGL 
LSWTLSRVLWLSGLSEPGAARQPRIMEEKALEVYDLIRTIRDPB 
KPNTLEELEWSBSCVEVQEINEEEYLVIIRFTPTVPHCSUITL 
IGLCLRVKLQRCLPFKHKLEIYISEGTHSTEEDINKQINDKERV 
AAAMEN PNLRE I VZQCVLE P D 


5530


4541 


2606 


AQIVHAISYCHKLHVGHRDIiKPENWPFEKQGLVKLTDFGFSNK 
FQPGKKIjTTS CGSLAYSAPE I LLGDEYDAPAVD I WSLGVI L FML 
VCGQPPFQEAmJSETDTMIMDCKYTVPSHVSKECKDLITRMLQR 
DPKRRASLEEI ENHPWLQGVDPSPATKYNI PLVS YKNLSEBEHN 
S 1 IQRMVLGDIADRDAIVEALETmYNHITATYFLLAERILREK 
QEKEIQTRSASPSNIKAQFRQSWPTKIDVPQDLEDDLTATPLSH 
ATVPQS PARAADSVLNGHRSKGLCDSAKKDDLPELAGPALSrVp 
PASLKPTASGRKCLFRVEEDEEEDEEDKKPMSLSTQWLRRKPS 
VTNRLTSRKSAPVLNQ IFEEGESDDEFDMDENLPPKLSRLKMNI 
ASPGTVHKRYHRRKSQGRGSSCSSSETSDDDSESRRRLDKDSGF 
TYS WHR RDSSBGP PGSEGDGGGQS KPSNASGG VDKASPS ENNAG 
GGS P S S GS GGH PTNTSGTTR RCAG P SNS MQLAS RS AGEL VES I> K 
LMS L CLGS QLHGS TKY 1 1 DPQNGLS FSS VKVQEKS TWKMCI S S X 
GNAGQVPAVGG I KF FSDHMADTTTELER I KSKNLKNNVLQLPLC 
EKTISVNIQRNPKEGLLCASSPASCCHVI 


5531 


24 


515 


gsopraprprdsmerpepelirqswravsrsplehgtvlfarlf" 
alepdllplfqyncrqfss pbdcisspe fldh irkvmlv idaav 

TNVEDLSSLEE YLASLGRKHRAVGVKLSS FSTVGES LLYMLEKC 
LGPAFTPATRAAWSQLYGAWQAMSRGWDGE 


5532 


3395 


1402 


SDWMWGKRKMI I EJ3E TE FCGEELLHSVLQCKS VFDVLDGE EMR - 
RARTRANP YEM I RGWFIiNRAAMKMANMDF VFDRMFTNPRDS YG 
KPLVKDREAELLYFADVCAGPGGFSBYVLWRKKWHAKGFGMTLK 
GPNDFKliEDFYSASSELFBPYYGEGGIDGDGDITRPSNISAFRW 
FVrJ^TDRKGVHFLMADGGFSVEGQBNLQE ILSKQLLLCQFLMA 
LS IVRTGGHFI CKTFDLFT P FS VGLVYLL YCCFERVCLFKP ITS 
R PANS ERYWCKGLKVG IDDVRD YL FAVN I KLNQ LRNTDSD VML 








WPLEVIKGDHEFTDYMIRSNESHCSLQIKALAKIHAFVQDXTL 
SEPRQAEIRKECLRLWGIPDQARVAPSSSDPKSKPPELIQGTB1 
DI FSYKPTLLTSKTLEKI RPVFDYRCMVSGSEQKPL1GLGKSQI 
YTWDGRQSDRWIKLDLKTELPRDTLLSVEXVHBLKGBGKAQRK1 
SAIHILDVLVLNGTDVREQHPNQRIQLAEKFVKAVSKPSRPDMN 
PIRVKEVYRLEEMSKIFVRIiEMKIIKGSSGTPKLSYTCRDDRHF 
VPMGLYIVRTVNEPWTMGPSKSPKKKFFYNKKTKDSTFDLPADS 
IAPFHI CYYGRLPWEWGDG IRVHDSQKPQDQDKLS KEDVLS FIQ 
MHRA 


5533


94 


789 


MKERRAPQPWARCKLVLVGDVQCGKTAMLQVLAKDCYPETYVP 
TVFENYTACLETSEQRVELS LWDTSGS P YYDNVR P t,rY<5 n<; nav 
LLCFDISRPETVDSALKKWRTEILDYCPSTRVLLIGCKTDLRTD 
LSTLMELSHQKQAPISYEQGCA1AKQLGPE1YLEGSAFTSEKSI 
HSIFRTASMLCLNKPSPLPQKSPVRSLSKRLLHLPSRSELISPT 
FKKEKAKXCS IM 


5534


3 


605 


LVRGRARAANPGRVaAMDGLRQRVEHFLEQRNLVTEVLGALEAK 
TGVBKR YLAAGAVTLLS LYl*ti Ft3 YGAJ3T .T jTW t,TR ituvtj a vn e t w 

AI ES P S KDDDT VWLTYW WYALFGLAEFFSDLLLS WFP F YYVGK 
CAFLLF CMAPRP WNGALMLYQRWR P LFLRHHGAVDRIMNDIJSG 
RALDAAAGITRNVKPSQTPQPKDK 


5535


1029 


332 


KSFMDSEARJQCSLVELSDTQDETQKSDSENEDLKIDCLQESQEL 
NLQKLKNSERI LTEAKGKMRELTVNT KMJCEWLT vtf r.T Kiwunii w 

SVSKQYTLFOn , KLEHDAEQAKVELTETQKQLQELENKDLSDVAM 
KVKLQKEFRKKVDAAKLRVQ VLQKKQ QDS KKLAS LSI QNEKRAN 
ELEQS VDHMKYQ K I Q LQRKLQEENE KRKQLDAV I KRDQG K I KVI 
LSYIPAKYNMKC 


5536 


942 


282 


AAATAASLSPRGCRLRTPSSDVSPSRA?PPSAAPijPTGRAQMSP 
SGRLCLLTI VGLI LPTRG QTLKDTTS SS SADAT IMDI QVPTRAP 
DAVYTELQPTSPTPTWFADETPQPQTQTQQLEOTDGPLVTDPET 
HKSTKAAHPTDDTTTLSERPSPSTDVQTDPQTLKPSGFHEDDPF 
FYDEHTLRKRGLLVAAVLFXTGIIILXSGKCRQIjSRLCRNHCR 


5537 


3 



2391 


RARVSSPQLRVFRSGRPRRLRVLRINRTSVAliRbAGTGRFVAXT 
PGHPGS WEMGLLT FRDVAVEFS LEE WEKLEPAQKNLYODVMLEN 
YRNLVSLGLWSKPDLITFLEQRKEPWNVKSEETVAIQPDVFSH 
t*NKDLLTEHCTBASPQKVISRRHGSCDLE!JLHXRKRWKREECEG 
HNGCYDEKTFKYDQFDESSVESLFHQQIliSSCAKSYNFDQYRKV 
FTHSSLLNQQEB IDI WGKHHI YDKT3 VLFRQVSTLNS YRNVFIG 
EKNYHCNNSEKTLNQSSSPKNHQENYFLEKQYKCKE FSEVFLQS 
MHGQEKOEQS YKCNKCVEVCTQSUCH IQHQTIHIRENS YS YNKY 
DKDLSQSSNLRKQI 1 RNEEKP Y KCEKCGDSLNHS LHLTQHQ 1 1 P 
TEEKPYKWKECXSKVFNLNCSLYLTKQQQIDTGENLYKCKACSKS 
FTRSSNL IVHQRIHTGEKP YKCKECGKAFRCSS YLTKHKR IHTG 
EKPYKCKECGKAFNRS SCLTQHQTTHTGE KLYKCKVCS KS YAJRS 
SNLIMHQRVHTGEKPYKCKBCGKVFSRSSCLTQHRKIHTGENLY 
KCKVCAKPFTCFSNLIVHERIHTGEKPYKCKECGKAFPYSSHLI 
RHHRIHTGBKPYKCKACSKSFSDSSGLTVHRRTKTGEKPYTCKE 
CGKAFSYSSDVIQHRRIHTGQRPYKCEECGKAFNYRSYLTTHQR 
SHTGERP Y KCEE CGKAFNSRS YLTTHRRRHTGERPYKCDECGKA 
F S YRS YLTTHRRS HS GER P YKCEE CGKA FNS RS YX I AHQRSHTR 
EKL 


5538


926 


161 


HSMMM Ki P WGSI P VLMLLLLIiGL I DI S QAQLS CTG PPAI PG I PG ~ 
I PGTPGPDGQPGTPGI KGEKGLPGLAGDHGEFGEKGDPGIPGNP 
GKVGPKGPMGPKGGPGAPGAPGPKGESGDYKATQKTAFSATRTI 
NVPLRRDQTIRFDHVITNMNNNYEPRSGKFTCKVPGLYYFTYHA 
SSRGMLCVNLMRGRERAQKVVTFC^YAYNTFQVTTGGMVLKLEQ 
GENVFLQATDKNSLLGMEGANS IFSGFLLFPDMEA 


5539


38 


1258 


HRGPSGAAAPGCALPRGQALBGPRSCRRPQPMARRYDELPbYPG " 
IVDGPAALASFPETVPAVPGPYGPHRPPQPIiPPGLDSDGLKREK 
DE I YGHPLFPLLAL VFE KCELATCS PRDGAGAGLGTPPGGDVCS 
SDSFNED1AAFAKQVRSERPLFSSNPELDNLVIQAIQVLRFHLL 








BLEKVHOLCDNFCHRY I X CLKGKMPI DL VI BDR DGGCREDFSD Y " 
PASCPSLPDQNNMWIRDHEDS3SVHLGTPGPSSGGLASQSGDNS 
SDQGDGLDTSVASPSSGGEDBDLDQERRRNKKRGI FPKVATNIM 
RAWLFQKLSHPY PS EEQKKQLAQDTGLTILQVNNW F INARRRI V 

QPMIDQSNRTGQGAAFSPEGQPIGGYTETQPHVAVRPPGSVGMS 
LNLEGEWHYL 


5540 


148 


1440 


PPLG AG AG VHARS PHP ARRLP LTTAG VGGRAPDI.LPT-P wp nR»a 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYPELPHYPG IVD 
GPAALAS F PETVPAVPG P YGPHRPP QP LPPGLDSDGLKRB KDE I 
YGHPLFPLUUiVFEKCEIiATCSPRDGAGAGLGTPPGGDVCSSDS 
FNEDWTAFAKQVRS3RPLFSSNPELDNLMIQA1QVLRFHLLELE 
KGXMP IDLVTEDRDGGCR EDFEDYPASCPSLPDQNNI WI RDHBD 
SGSVHLGTPGPS SGG LASQSGDNSSDQGVGLDTS VAS PSSGGED 
EDLBQE PRRNKKRG I FPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
LAQDTGLTI LQ VNNW F INARRR I VQ PMI DQSNRTGQGAAFSPEG 
QPIGGYTETEPHVAFRAPASVGDBFGTRKEEWHYL 


5541


143 


1440 


PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDBI 
YGHPL FPLLALVFEKCELATCS PRDG AGAGLGTPPGGDVCS SDS 
FNE DNTAF AKQVRS ERPLFSS NPELDNLMIQA1 QVLRFHLLBLE 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDOGVGLDTSVASPSSGGED 
EDLDQEPRRNKKRGI FPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
LAQDTGLTILQVNNW FI NANR R I VQPMIDQSNRTGQGAAFS PEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYI* 


5542 


146 


1440 


PPLG AGAG VHARS PHPARRLPIiTTATtvnrrp a pnr t DTPUDnuDn"" 

PSGAAAPGCALPRGQAIiEGPRSCRRPQPMARRYDELPHYPGIVD 

GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDBI 

YGHPLFPLLALVFEKCEIATCSPRDGAGAGLGTPPGGDVCSSDS 

FNEDNTAF AKQVRSER PLFSSNPELDNLM I QAI QVLRFHL LELB 

KGKMPIDLVIEDRDGGCREDFEJDYPASCPSLPt)QNNIWIRDHBD 

SGSVKLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGBD 

BDLDQEPRRNKKROI FPKVATNIMRAWLFQHLSHPYPSEEQKKQ 

LAQDTGLTI LQVNNWF INARRRI VQPMIDQSNRTGQGAAF3 P EG 

QPIGGYTETBPHVAFRAPASVGDBFGTRKEEWHYL 


5543 

5532 


2405 


665 


RWVRBQPWPLRTSEAVKl'PALRPFPGPRGVSPFPKPDWGKSPAP 
KRPFSDSGAFWS PERRPG VLEAPRRRPVPASFRAVP PKPTRVHG 
SSASRDRVLARTMIVADSECRAELKDYLRFAPGGVGDSGPGEEQ 
KBSRARRGPRGPSAFI PVEEVLREGAESLEQHLGIiEALMS SGRV 
DNliAVVMGLHPDYFTSFWRLHYLLLHTDGPLASSWRHYIAIMAA 
ARHQCS YIjVGSHMAEFLQTGGDPEWLLGLHRAPEKLRKLSE INK 
LIjAHRPWLITKEHIQALLKTGEHTWSLAELIQALVLLTHCHSLS 
SFVFGCGILPEGDADGSPAPQAPTPPSEQSSPPSRDPLNNSGGF 
ESARDVEALMERMQQLQESLLRDEGTSQEEMESRPELEKSESLL 

vtpsadilepsphpdmlcfvedptfgyedftrrgaqapptfraq 
DYTWEDHG YSL I qrlypeggqlldekfqaayslt YNTIAMHSG V 
lyrSVLRRAIWNYIHCVFGIRYDDYDYGEVNQLLERNLKVYIKTV 
ACypEKTTRRMYNLFWPJ{FRHSEKVirVNLIjIiLEARMQAAIiLYAL 
RAITRYMT 




1895 


514 


LGGLLGRQRLLLRMGAGR LGAPMBRHGRASATS S AGE QAAGD 
PEGRRQ EPLRRRAS S AS VPAVGAS AEGTRRDRLGSYSGPTS VSR 
QRVESLRKKRPLF PWFGLD IGGTLVKLVYFEPKDITAEEEEEEV 

ESLKS irkyltsnvaygstgirdvhlelkdltlcgrkgnlhfir 

FPTHDMPAF IQMGRDKNFS SLHTVFCATGOGAYKFEQDFLT IGD 
LQLCKLDELDCLIKGILYIDSVGFNGRSQCYYFENPADSEKCQK 
LPFDLKNPYPLLLVNIGSGVSILAVYSKDNYKRVTGTSLGGGTF 
FGLCCLLTGCTTFEEALEMASRGDSTKVDKLVRDIYGGDYBRFG 
LPGWAVASS FGNMMSKEKREAVS KEDLARATLITI TNNIGS IAR 
MCALNENINQWPVGNFLRINTIAMPXIAYALDYWSXGQLKALF 
SEHEGYFGAVGALLELLK1P 


5545 


802 


131 


GAMWSAGRGGAAWPVUX3LLLALLVPGGGAAKTGAELVTCGSVL 
KLLNTHHRVRLHS FOX KYGSGSGQQSVTGVEASDDANS YNR IRQ 
GSEGGC PRG S P VRCGQAVRLTHVLTG KNLHTHHF PS PLSNNQE V 
SAFGBDGEGDDLDLWTVRCSGQHWEREAAVRFQHVGTSV?LSVT 
GEQYGS PIRGQHEVHGMPSANTHNTW KAMEG1PI KPSVEPSAGH 
DEL 


5546 


1592 


146 


FVP RGGHS S MGQS GRS RHQKRARAQAQ LRNLEAYAAN PHS P VPT 
RGCTGRNTROLS LD VRR VM V PF ,TA <? P f.n VP nr WTff Q T .tmrnntra n o 
LGVTHFLI LSKTETNVYFKLMRLPGG PTLTFQVKKYS LVRDWS 
SIiRRHRMHBQQFAHPPLLVLNSFGPHGMHVKIiMATMFQNLFPSI 
NVHKVNLNT IKRCLLIDYNPDSQELDFRHYS I KWPVGASRGWK 
KLLQEKFPNMSRLQDISELLATGAGLSESEAEPDGDHNITELPQ 
AVAGRGNMRAOQSAVRLTE IGPRMTLQLIKVOEGVGBGKVMPHS 
FVS KTEEELQAI LEAKE KKLRLKAQRQAQQAQNVQRKQEQREAH 
RKKSLEGMKKAR VGGSDEEA8 GI PSR TASLELG EDDDEQEDDDI 
EYFCQAVGEAP S EDLF P EAKQ KRLAKS PGRKRKRWEMDRGRGRL 
CDQKFPKTKDKSQGAQARRGPRGASRDGGRGRGRGRFGKRVA 


5547


1592 


146 


FVPRGGHSSMGQSGRSRHQKRARAQAQLRNLEAYAANPHSFVFT 
KOI, ITjKN ±KUIj&IjUVHRVWBPLrASRLQvRKKNSLKDCVAV 
LG VTHFLILSKTETNVYFKLMRLPGG PTLTFQVKKYS LVRDWS 
SLRRHRMHEQQ FAHPPLLVLNSFGPHGMHVKLMATMFQNLFPSI 
NVHKVNLNT IKRCLLIDYNPDSQELDFRHYS I KWPVGASRGMK 
KLLQEKFPNMSRLQDIS ELLATGAGLSESBAEPDGDHNI TELPQ 
AVAGRGNMRAOQSAVRLTEIGPRMTLQLTKVQEGVGEGKVMFHS 
FVS KTEEELQAI LEAKE KKLRLKAQRQAQQAQNVQRKQEQREAH 
RKKS LEGMKKARVGGSD EEASGI PS RTASLELGEDDDEQEDDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKS PGRKRKRWEMDRGRGRL 
CDQKFPKTKDKSQGAQARRG PRGASR DGGRGRG RG R PGKRVA 


5548


1 


2153 


DQTGP P ETIAFT FP RSTM B P LCP LLLVGFS LPLARALRGNETTA 
DSNETTTTSGPPDPGASQPLLAWLLLPLLLLLLVLLLAAYFFRF 
RKQRKAWSTSDKKMPNGILEEQEQQRVMLLSRSPSGPKKYFPI 
PVEHLE EEIR I RS ADDCKQFREE FNSLP 5GHIQGTFELANKE EN 
REKNRYPNTLPrJDHSRVIjLSQLDGIPCSDyiNASYIDGYKEKNK 
FIAAQGPKQE TVND FWRMVWEQKSATI VMLTNLKERKEEKCHQY 
WPDQGCWTYGNIRVCVEDCWLVDYTIRKFCIQPQLPDGCXAPR 

LVSOIjHT?T , ^WPnPrJUOP^'OT<^MT.lf T7T.IT KTTMTUTvrnTinrjn 
u»uyunr x onri-ir uvrr J v ■LVTPiLtiv.J? Jjmvv.&x Livlir VciJ\\Jtr X V V^1L» 

S AGVGRT13TFI VTDAMMAMMHAEQKVD VPBFVS R IRNQRPQMVQ 
TDMQYTF I YQALLE YYLYGDTELEWS SLEKHLQTMHGTTTHFDK 
IGLBE EFRKLTNVR IMKENMRTGNLP ANMKKARV IQ 1 1 P YDFNR 
VILSMKKGQEYTDYrNASFITOYRQKXJYFIATQGPLAHTVEDFW 
RMIWEWKSHTIVMLTEVQBREQDKCYQYWPTEGSVTHGEITIEI 
KNDTLSEAISIRDPLVTLNQPQARQEEQVRWRQFHFHGWPE IG 
IPAEGKGM IDLI AAVQKQQQQTGNHP ITVHCSAGAGRTOTF IAL 
SNILERVKAEGLLDVFQAVKSLRLQRPHMVQTLEQYBFCYKVVQ 
DFID1 FSDYANPK 


5549 


915 


256 


FEATGGKRIAFKMAGTARHDREMAIQAKKKLTTATDPIERLRLQ 
CLARGS AG X KG LG RVFR IMDDD2JNRTLDFKEFMKGLNDYAWME 
KEEVEELFQRFDKDGNGTI D FNEFLLTLRPPMSRARKEV3 MQAF 
RKLDKTGDGVI T I EDLRE VYNAKHHP KYQNGE WS E EQ VFRKFLD 
NFDSPYDKDGLVTPEEFMN YYAGVSAS IDTDVYPT iMMRTAMTPr. 


5550 


2364 


1210 


RKRKVFLKMRRLNRKKTLSLVKELDAFPKVPESYVETSASGGTV 
SLIAFTTMALLTIME FSVYQDTWMKYEYEVDKDFSSKLRIN IDI 
TVAWK CQ Y VG AD VL DrjAETMVAS ADGL VYR PTVFD T »S PQQKEWQ 
RMLQL I QSRLQEEHSLQD VI FKS AFKSTSTALPP REDDSS QS PN 
ACRIHGHLYVNKVAGNFHITVGKAIPHPRGHAHIAALVNHESYN 
FSHRI DHLS FGBLVPAI INPLDGTEKI AI DHNQMFQYFI T WPT 
KLHTYKrSADTHQFSVTERERIINHAAGSHGVSGIFMKYDLSSL 
MVTVTEEHMPFWQFFVRLCG I VGGI FSTTGMLHG I GKFI VE I IC 
CRFRLGS YKP VNS VP FEDGHTDNHLPLLENNTH 


5551 


211 


1700 


MQRDHTMDYKESCPSVSIPSSDEHREKKKRFTVYKVLVSVGRSfi "" 


5552






WFVFRRYAEFDKLYNTLKKQyPAMALKlPAKRIFG^FDPbFIK 

QRRAGt.NEFIQNLVRYPEI*YNHPDVRAFLOMDSPKHQSDPSEDB 

DERSSQKLHSTSQNINLGPSGNPHAKPTDFDFLKVIGKGSFGKV 

LIAKRKLIXSKFYAVKVLQKKIVlJ^KEQKHIMAERNVt 

PFLVGLHYSFC^EKLYFVLDFVNGGBLFPHliQRERSFPEHRAR 

FYAAE IAS AIiG YLHS I KI VYRDLKPEN ILLDSVGHVVLTD FGL C 

KEG I AI S DTTTT FCGT PEYLAPE VI RKQP YDNT VDWWCLG AVL Y 

EMLYGLPPFYCRDVAEMYDNILHKPLSLRPGVSLTAWSILEBIjL 

EKDRQNRLGAKEDFLEIQNHPFFESLSWADLVQKKIPPPFNPNV 

agpddirnfdtapteetvpysvcvssdysivnasvlbaddafvg 
fsyappsbdlfl 




274 B 


930 


lgpaagaamgkkhkkhkaewrssyedyadkplekplklvlkvgg 

SEVTELSGSGHDSSYYDDRSDHERERHKEKKKKKKKKSEKEKHI. 

ddeerrkrkeekkrkrerehcdtegeaddfdpgkkvevepppdr 

PVRACRTQPAENESTPIQQLLBHFIiRQLQRKDPHGFFAFPVTDA 

iapgysmiikhpmdfgtmkdkivaneyksvtefkadfklnicdna 

MTVNRPDTVYYKLAKKI LHAG FKMMSKQAALLGNEDTAVEEPVP 
EVVPVQVETAKKSKKPSREVI S CMFEPEGNACSLTDSTABEH VI* 

alvehaadeardrinrflpggkmgyi^rngdgsllysvvntaep 

DADEEETHPVDLSSLSSKLLPGFTTLGFKDBRRNKVTFL3SATT 
ALSMQNNSVFGDL2CSDEMELLYSAYGDETGVQCALSLQEFVKDA 
GS YS KKWPDLLDQI TGGDHSRTLFQLKQRRNVPMKPPDSAKVG 
DTliGDSSSSVLEFMSMKSYPDVSVDlSMLSSLGKVKKELDPDDS 
HLNLDETTKLLQDLHEAQAERGGSRPSSNLSSLSNASERDQHHL 
GSPSRLSVGEQPDVTHDPYEFLQSPEPAASAKT 


5553 
5554


74 


1095 


LGREAVYLVSRMDGPVAEHAKQE PFHVVTPJULBSWALSOVAGMP 
VFLKCENVQPSGSFKIRGIGHFCQEMAKKGCRHLVCSSGGNAGI 
AAAYAARKLGIPATIVLPESTSLOWQRLQGEGAEVQLTCKVWD 
EANLRAQELAKRDGWENVP P FDH PLI WKGHASLVQELKAVLRTP 
PGALVLAVGGGGLLAG WAGLLEVGWQH VPI I AMETKGAHCFHA 
AITAGKLVTLPDITSVAKSLGAKTVAARALECMQVCKIHSEVVE 
DTEAVSAVQQLLDDERMLVEPACGAALAAI YSGJULRRLQAEGCL 
PPSLTSVWIVCGGNNINSRELQALKTHIjGQV 




166 


2310 


CSGRTGGRGSI^PAENVCLTCKLSGAETRGLLCPALRTWIMKVL 

GRSFFWVLFPVLPWAVQAVEHEEVAQRVIKLHRGRGVAAMQSRQ 

YIVRDS CRKLSGLLRQKNAVLiYKLKTAIGAVEKDVGLSDEEKLFQ 

VHTFE I FQKRLNES ENS VFQAVYG LQRALQGDY KDWNMKES SR 

QRLEALRFAAIKEETEYMELLAAEKHQVEALKNMQHQNQSLSML 

DE ILEDVRKAADRIiEEEI E EHAFDDNKS VKGVW FEAVLRVEEBE 

ANSKQNITKREVEDDLGLSMLIDSQNWQYILTKPRDSTIPRADH 

HFIKD IVTIGMLS h PCG WLCTAIGLPTMFGYI I CGVULGPSGLN 

SIKSIVQVBTLGEFGVFFTLFLVGLEFSPEKLRKVWKISLQGPC 

YfTTLLMlAFGLLWGrHOjLRIKPTQSVFIsrCLSLSSTPLVSRFLM 

GSARGDXEGDIDYSTVLLGMIiVTQDVQLGLFKAVMPTLrQAGAS 

ASSSIWBVLRILVLIGQILFSLAAVFLLCLVIKKYL1GPYYRK 

LHMESKGxVKEILILGISAFIFLMLTVTELLDVSMELGCFLAGAL 

VSSQGPWTBEIATSISPIRDFLAIVFFASIGLHVFPTFVAYEI, 

TVLVFLTLSVVVMKFLLAAL VLS III I» PRSSQYI KWI VSAGLAQ V 

53FSFVLGSRARRAGVISREVYLLILSVTTLSLLLAPVLWRAAI 

TRCVPRPERRSSL 


5555 
5556


212 

5635


1425 
3346 


LSLRTRETPAPPRCEAASQGRVGWPADAAAEEAVRSVWNRTRDR 
GTMAP0NL5rFa.LLLYLIGAVrAGRJDFYKriiGVPRSASIKDlK 
ftAAiuu^Mivunrutm ^UUr UAQbKFQDIjGAAYcVLSDSEKRKQY 
DTYGEEGLKDGHQSSHGDI FSHFFGDFG FM FGGT PRQQDRNT PR 
GSDI rVDLEVTLEEVYAGNFVEVVRNKPVARQAPGKRKCNCRQE 
MRTTQLGPGRFQMTQEVGCDBCPNVKLVNEERTLEVEIEPGVRD 
GHEYPFIGEGEPHVDGEPGDLRFRIKWKHPIFERRGDDLYTNV 
TISLVESLVGFEKDITHLDGHKVHrSROKITRPGAKLWJCKGEGL 
PNFDNNNIKGSLIITFDVDFPKEQLTEEAREGIKQLLKQGSVQK 
VYNGLQGY 

RTRG^SKNCV^^yEEYLLRMFQGTFYLLQKITKDNNAHTVKSR 








I££tDES)fIEKi^DFIJ^PVsVHLR^SsVSQPPVVEFrfLLFK 
YTFHQPTHEGYFSCLDIWTLFLDYLTSK1KSRLGDKBAVLNRYE 
DALVLLLTEVLNR IQ FRYNQAQLEELDDETLDDDQQTEWQRY LR 
QS I»E WAKVMELLPTHAFSTLPP VLQDNLEVYLGLQQF I VTSGS 
GHRLNI TASNDCRRLHCS LRM,SSLI*QAVGRLAEYF 1 GDVFAAR 
FNDALTVV ERL VKVTL YGSQ IKI>YN I ETAVP S VLKPDL IDVHAQ 
SLAALQAYSHWLAQYCSEVHRQNTQQPVTIiISTTWDAITPLIST 
KVQDKLLLSACHLLVSLATTVRP VPLIS IPAVQKVFURITDASA 
LRLVDKAQVLVCRALSNI LLLPWPNLPENEQQWPVRS INHASLI 
SALSRDYRNLKPSAVAPQRKMPLDDTKLI IHQTLSVLEDIVENI 
SG ESTKSRQI CYQSLQES VQVSLALFPAFI HQSDVTDEMLS FFL 
TLFRGLRVQMGVPFTEQ I IQTFLKMFTREQLAESILHEGSTGCR 
WEKFLKILQVWQEPGQVFKPFLPS 1 1 ALCMEQVY P 1 1 AERPS 
PDVKABLFELLFRTLHHNWRYFFKSTVLASVQRGIAEEOMENEP 
GFSAIMQAFGQSFLQPDIHLFKQNLFYLETLNTKQKLYHKKIFR 
TAMLFQFVNVLLQVLVHKSHDLLQEEIGIAIYNMASVDFDGFFA 
APLPEFLTS CDGVDANQKSVLGRUFKMDRVRRERGRAKRRABWA 
RKPGTCAARRGHIEASGRGLCPPCSLAAAHEMPADLVL 


5557


1712 


491 


VILGAGLRDKDMWI P WGLPRRLRLS ALAGAGRFCILGS EAATR 
KHLPARNHCGLSDSSPQLWPBPDFRNPPRKASKASLDFKRYVTD 
RRI*A5TLAQ I YIX3KP S R P PHLLLECNPGPG ILTQALLEAG AKW 
ALESDKTFIPHLESLGKNLDGKLRVIHCDPFKLDPRSGGVIKPP 
AMSSRGLFKNLGIEAVpWTADIPIiKVVGMPPSRGEICRALWKLAY 
DLYS CTSI YKFGRIEVNMFIGEKEPQKLMADPGNPDLYHVLS VI 
WQLACEIKVLHMEPWSSFDIYTRKQPLENPKRRELLDQLQQKLY 
LIQMIPRQNLFTKNLTPMMYNIFFHIjLKHCFGRRSATVIDHI*RS 
LTPLDARDILMQ1GKQED3KVVNMHPQDPKTLFETIERSKDCAY 
KWLYDETLEDR 


5558 


1509 


96 


RAGCTHPQ VPADLGAPAE PRRPQKTCVCLLQPQPGGQRGPTTMI " 
ItTVTPSMRLWTPVGVLTSIAYCLHQRRVAIAELQEADGQCPVDRS 
LLKLKMVQWFRHGARSPLKPLPLEEQVEWNPQLLEVPPQTQFD 
YTVTNLAGGPKPYSPYDSQYHETTLKGGMFAGQLTKVGMQQMFA 
LGERLRKNYVED1PFLSPTFNPQEVF1RSTN1FRNLESTRCLIA 
GLFQCQKEGPI I IHTDBADSEVLYPNYQSCWSIiRQRTRGRRQTA 
SLQPGISEDLKKVKDRMGrr^SSDKVDFFILliDNVAAEQAHNLPS 
CPMLKR FARMI EQRAVDTSLYI L P KEDRES LQMAVGP Pi»H 1 1, ES 
NLIiKAMDSATAPDKIRKLYliYAAHD VTF I P LLMTLGI FDHRW PP 
FAVDLTMELYQHLESKE WFVQL YYHGKEQ VPRGf? PDGLCPLDMF 
LNAMS VYTLS PEKYHALCSQTQVMEVGNEE 


5559 


150 


1993 


PLAATAHFAKMSRVAKYRRQVSEDPD I DSLLETLS PEEMEELEK 
ELDVVDPDGSVPVGLRQRNQTEKQSTGVYlTOEAMIiNFCBKETKK 
LMQREMSMDESKQVBTKTDAKNGEBRGRDASKKALGPRRDSDLG 
KEPKRGGLKKSFSRDRDEAGGKSGEKPKEBKIIRGIDKGRVRAA 
VDKKEAGKDGRGEERAVATICKEEE KKGS DRNTGLSRDKDKKR EE 
MKBVAKKEDDEKVKGERRNTDTRKEGE KMKRAGGNTDMKKEDEK 
VKRGTGNTDTK1ODDEKVKKNEPLHEKEAKD0SKTKTPBKQTPSG 
PTKPSEGPAKVEEEAAPS I FDEPLERVKNNDPEMTSVNVNNSDC 
ITNEILVRFTEALEFNTVVKLFAIiANTRADDHVAFAIAIMLKAN 
KTITSJLNLDSNHITGKGILAIFRALLQNNTLTELRFHNQRHI CG 
GKTEWEIAKLLKENTTLLKLGYHFElAGPRMTVTNIiLSRWMDKQ 
RQKRLQEQRQAQ EAKGE KKDLLE V PKAGAVAKGS PKPS PQ P S P K 
PSPKNSPKKGGAPAAPPPPPPPLAPPMMENLKNSSSPATQRKM 
GDKVLPAQEKNSRDQLLAArRSSNI«KQLKKVEVPKLI»Q 


5560 


9 


921 


SS WEFS ALS VSMACLS P SQLQKFQQDG FLVLEGPIjSAE E CVAM 

qqrigelvabmdvplhcrtefstqeeeqlraqgstdyflissgdk 
irfffekgvfdekgnflvppeksinkighalhahdpvfks iths 
fkvqtlarsiiglqmpvwqsmyifkqph fggevs phqdas flyt 
EPLGRVLGVWIAVEDATLENGCLWFI pgshtsgvsrrmvrapvg 
SAPGTS FLGS EPARDNSXi FVPTPVQRGALVLIHGEWHKS KQNL 
SDRSRQAYTFHLMEASGTTW3PENWLQPTAELPPPQLYT 


5561 


2175 


1775 


CYFIFQFFSSPYPGLHPHQTPAPLPNPGLYPPPVSI4SPGQPPPQ 








QLLAPTYFSAPGW1TFGNPSYPYAPGALPPPPPPHLYPNTQAPS 
QVYG GVTY YNPAQQQVQ PKPS P PRRT PQP VTI KPPP PEWSRGS 
S 




342 


1385 


SSGKNDMAAAGAAGLVRGIiKAGVLSQADYLN^ 

LQS 7D YGK FLANEAS PLTVS VIDDRL KEKMWE FRHMRNHAYEP 

LASFIjDFITYSYMIDNVILLITGTLHQRSIAELVPKCHPLGSFB 

QMEAVNTAOTPAELYNAILVDTPLAAFFQDCIS BQDLDEMNI EI 

IRNTI*Y KAYLES FYKFC TLLGGTTADAMC P I LE PEADRRAPI IT 

INSFGTELS KBDRAKL FPHCGRL YP EGIAQliARADDYEQ VKNVA 

DYYPEYIOJiFEGAGSNPGDFCTIjEDRFFEHEVKLNKLAFLNQFHF 

gvfyafvklkeqecrnivwiaeciaqrhrakidnyipif 


5563 


342 


1385 


SSGmDmAAGAAOLVRGLKAGVLSQADyWLVQCETLBDLKLH 
LQSTDYGNPIJU^EASPLTVS\riDDRLKEKMVVEFRHMRNIIAYEP 
LAS FLD FI T YS YM IDNVI LLI TCTLHQRS IABLVP KG I PLGS FE 
QMEAVN IAQTPAE LYN AI LVDTPLAAFFQDCISEQDLDEMNI EI 
IRNTLYKAYLES FYKFCTLLGSTTADAMCP1 I>EFEADRRAFI IT 
INSFGTELSKEDPJlKLFPHCXSRLYPEGI^OLARADnvPrtvvKrvA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFBHEVKLlJXIAFU f Q FHP 
GVFYAFVKLKEQECRNIVWIABCIAQRHRAKIDNYIPIF 


5564 


3 


914 


RVRRDKRAVWTARGRRRCGDSMSGGWMAQVGAWRTGAtGLALLL ' 
LLGLGLGLEAAASPLSTPTSAQAAGPSSGSCPPTKFQCRTSGLC 
VPLTWRCDRDLD CSDGS DEEECRIEPCTQKGQCP P P PGL PCPCT 

GVSDCSGGTDKKbRNCSRIJVCLAGELRCTLSDDCIPLTWRCDGH 
PDCPDSSDELGCGTNEILPEGDATTMGPPVTT,T?<iVT<;T dv&ttm 

GPPVTLES VPS VGNATSSSAGDQSGS PTAYG VI AAAAVLSAS LV 
TATLLLLS WLRAQ ERLR PLG LLVAMKESLLLSEQ KTS L P 


5565 


993 


138 


RWNSPNPARAGS I S RPQRAPGS VSAVAMTAAVFFGCAFIAFG PA 
IALYVFTIATEPLRIIPLIAGAFFMLVSIjGISSLVWFMARVIID 
NKDGPTQKYLLI FGAFVSVY IQEMFRFAYYKLLKKASEGLKS IN 
PGETAPSMRLtiAYVSGLGFGIMSGVPSFVJTIXiSDSI^PGTVGIH 
GDSPQFFLYSAFMTLVI ILLHVFWG IVFFDGCK fOTKWRTT t tut 

LTHLLVSAQTFISSYYGINLASAFIILVLMGTWAFLAAGGSCRS 
LKLCLLCQDKNFLLYNQRSR 


5566 


2043 


1232 


SHIQHHGRGAQAPVKMVSWMISRAWLVFGMLYPAYYSYKAVKT " 
K^KEYVRW^YWIVFALYTVISTVADQTVAWFPLYYELKIAFV 
IWLLSPYTKGASLIYRKFLHPLLSSKEREIDDYIVQAKERGYET 
MVNFGRQGLNLAATAAVTAAVKS QGA I TERLRSFSMHDLTTIQG 
DEPVGQRPYQPLPEAKKKSKPAPSESAGYGIPLKDGDEfCTDEEA 

EGPYSDNEMLTHKGPRRSQSMKSVKTTKGRKEVRYGSLKYKVKK 
RPQVYF 


5567 


1554 


233 


EFLGSGVSPDLANEDGtTALHQCCIDDFREMVQQLLEAGANINA 
CDS ECWTPLHAAATCGHLHLVELLI ASGANLLAVNTDGNMP YDL 
CDDEQTLD CLETAMADRGI TQDS I EAARAVPEZjRMLDD IRS RLQ 
AGADLHAPLDHGATLLHVAAANGFSEAAALLLEHRASLSAKDQD 
GWEPLHAAAYWGQVPIiVELLVAHGADLNAKSLMDETPLDVCGDE 
EVRAKLLELKHKHDAIiLRAOSRQRSLLRRRTSSAGSRGKVVRRV 
S LTQRTDL YR KQHAQEAI VWQQP P PTS PE P PEDNDDRQTGABLR 
P P PPEEDNPEWRPHNGRVGGSP VRHLYS KRLDRSVS YQLS PLD 
STTPHTLVHDKAHHTLADLKRQRAAAKLQRPPPEGPE3PBTAEP 
GLPGDTVTPQPDCGFRAGGDPPLLKLTAPAVEAPVERRPCCLLM 


5568 
5569


1731 
2 


587 

835


AEnRQPASRkGAGTTAAMAASGPGCRSWCLCPEVPSATFFTALL " 
SLLVSGPRLFLLQQPLAPSGIiTLKSEALRNWQVYRLVTYI FVYE 
NP ISLLCGA I 1 1 WR FAGNFERTVGTVRHCFFTVI FAIFSAI IPL 
SFEAVSSLS XLG E VEDARGFT P VAFAMLGVT T VRSRMRRALVFG 
MWPSVLVPKLLLGASWLIPQTSFIiSNVCaLSIGLAYGLTYCYS 
IDLSERVALKLDQTFPFSLMRRISVFKYVSGSSAERRAAQSRKL 
NPVPGS YPTQSCHPHLS PSHPVSQTQHASGQKLASWPS CTPGHM 
PTLPPYQPASGLCYVQNHFGPNPTSSSVYPASAGTSLGIQPPTP 
VNS PGTVYSGALGTPGAAGSKESSRVPMP 

QTPCPLAWERGSRSEDISVPGQKPPTCSSFSCiMbVGPSSLPHLG " 








DKLLLLIiLLIiPLRGQANTGCYGIPGMPGLPGAPGKDGYDGLPGP 
KGEPG I PAIPGIRGPKGQKGEPGItPGHPGKNGPMGPPGMPGVPG 
PMGIPGEPGBEGRYKQKFQSVTTVTRQTHQPPAPNSL1RFNAVL 
TNPQGDYDTSTGKFTCKVPGLYYFVYHASHTANLCVLLYRSGVK 
WTFCGHTSKTN0VNSGGVLLRLQVGEEVW1AVNDYYDMVGIQG 
SDSVFSGFLLFPD 


5570


264 


946 


RDRRDRGGVATSTBEPARPRAPQSRGPGPV5QTGRGRERGGGDT 
MSS PS PGKRRMDTDVVKL I ESKHBVTILGGLNBFVVKPYGPQGT 
P YEG GVWKVRVDLPDKYP FKSPS 1 nFMUJTT FW DM T nv a cr"r\rr>T 

DVINQTWTALYDLTNIFESFliPQLLAYPNPIDPLWGDAAAMYLH 
RPEBYKQKIKRYIQKYATEEALKEQEBGTGDSSSESSMSDFSED 
EAQDMEL 


5571 


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MS S PS PG KRRMDTD WKL I ESKHEVT ILGGLNE FWKFYGPQGT 
PYEGGVWfCVRVDLPDKY P PKS P ^ TfiFMMWT PWdmt nwa cr^tnrr'T 

DVINQTHTALYDLTNIFESFLPQLliAYPNPIDPLNGDAAAMYLH 
RPBBYKQKIKEYIQICYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5572


2802 


2085 


RTDYRTGIPGRRFRVMAAGDGDVKLGTLGSGSESSNDGGSESPG 
DAGAAABGGGWAAAALALLTGGGEMLLNVALVALVLLGAYRLWV 
RWGRKGLGAGAGAGEES PATSLPRMKKRDFShEQLRQYDGS RNP 
RILIAVNGKVFDVTKGSKFYGPAGPYGIFAGRDASRGliATFCLD 
KDALRDEYDDLSDLNAVQMESVREWEMQFKEKYDYVGRLLKPGE 
E PSEYTDEEDTKDHNKQD


5573


2562 


219 


VP ART P N AEDQG P E ARAATAT PCQ S GGRE RAGEAAEDG V KMAAF 
SEMGVMPEIAQAVEEMDWLLPTDIQAESIPLILGGGDVLMAAET 
GSGKTG AFS I P V I Q I VYETLKDQQBG KXGKTTI KTGASVLNKWQ 
MNP YDRGS AFAI GS DGLCCQ 3REVKEWHGCRATKGLMKGKHYYB 
VSCHDQGLCR VGWS TMQAS LDLGTDKFGFGFGG TGKKSHNKQFD 
NYGEEFTMHDTIGCYLDIDKGHVKFSKNGKDLGLAFEIPPHMKN 
QALF PACVLKNAELKFNFGEE EFK FPPKDGFVALS KAPDGYI VK 
SQHSGNAQVTQTKFLPNAPKALIVBPSRBLABOTLNNIKQFKKY 

GKLNLSQVRFLVLDEADGLLSQGYSDFINRMHKQrPQVTSDGKR 
LQVI VCSATLHS PDVKKLS E KI MHFPTWVDLKGEDSVPDTVHHV 
WP VN PKT DRLWER LGKS H IRT DD VHAKDNTR P G ANS PEMW S SA 
I KItiKGE YAVRAI KEHKMDQAI I FCRTKIDCDNLEQYFIQQGGG 
PDKKGHQFSCVCLHGDRKPHERKCNLERFKKGDVRFLICTDVAA 
RGID1HGVPWINVTLPDEKQNYVHRIGRVGRAERMGLAI SLVA 
T BKE YHVCS SRGW3CYNTRIJCEDGGCT3 WYWEMQLLS B IEE 
HLNCTISQVEPDIKVPVDEr^GKVTYGQKKAAGGGSYKGHVDIL 
APTVQELAALE KE AQTS PLHLG YLPNQLFRTF 


5574 


1731 


952 


NEGLEVFKEQELQPSDKGAVPEDASTERSAMASLGLQIjVGYXliG 
LLGL LGTLVAMLLPS WKTSS YVGAS I VTAVGFSKGLWMECATHS 
TGITQCD1 YS TLLGL PAD IQAAQAMMVTSSA IS S LAC I IS WGM 
RCTVFCQESRAKDR VAVAGG VFFI LGGIiLGFIP VAWNLHG XLRD 
FYSPLVPDSMKFEIGEALYLGI ISSLFSLI AGI ILCFSCSCQRN 
RSNYYD AYQAQ PLATRSS P R PGQPPKVKS EFNS YSLTGYV 


5575


456 


766 


LLV^PCPPPTAAAVIiLSSTGLMELLEKMLALTLA>au03PRTAb 
LCSAMLLTAS FSAQQHKGS IiQKDPLLS QAC VGCLEALLDYLDAR 
SPDIGRNSPHYLMFP 


5576 


249 


2146 


RS WGAP WFWRMRLLRRRjtlMP LRLAKVtiCAFVLFLFLLHRDVSS R 
EEATEKPWLKSLVSRKDHVLDLMLEAMNNLRJ3SMPKLQIRAPEA 
QQTLFS INQSCLPGPYTPAELKPFWERPPQDPKAPGADGKAFQK 
SKWTPLBTQEKEEGYKKHCFNAFASDRISLQRStiGPDTRPPBCV 
DQKFRRCPPLATTS VI IVFHNEAWSTLLRTVYS VLHTTPAILLK 
E I ILVDDASTEEHLKEKLEQYVKQLQVVRVVRQBERKGLITARL 
X^ASVAQAEVLTFIJ}AHCECFHGWLEPI,LARIAEDKTVVVS PDI 
VTIDLNTFEFAKPVQRGRVHSRGNFDWSLTFGWBTLPPHEKQRR 
KDETYPIKS PTPAGGLFS ISKS YFEHI GT YDNQMB I WGGENVEM 
SFRVWQCGGQLE1 I PCS VVGHVFRTkSPHTFPK6TSVI ARNQVR 
LAFA^WM)SYKKIFYRRNLQAAXMAQEKSPGDISERLQLREQttHC 
HNFS WYLHNVYPBM FVPDLTPTF YGAI KNLGTNQCLDVGENNRG 
GKPLI MYS CHGLGGNQ YFEYTTQRDLRHN IAKQLCLHVSKGALG 
LGSCHFTGKNSQVPKDEB WELAQDQLIRNSGSGTCLTSQD KKPA 
MAPCNPSDPHQLWLFV 


5577 


3 


1275 


RNSDCSCGBISVHCLP W VLFI LDLKVB^ ^CPtKLILLP VLLD 
YSLGLNDLNVSPPELTVHVGDSALMGCVFQSTEDKCI FKIDWTL 
S PGEHAKDE YVLYYYSNLS VP I G RFQNR VHLMGDI LCNDG S LLL 
QDVQKAD^TYJCBIRLKGESQVFKKAVVhW/LPEEPKBLMVHV 
GGLIQMGCVFQSTEVKHVTKVEWIFSGRRAKEE1VFRYYHKLRM 
SVEYSQSWGHFQNRVNLVGD 1 FRNDGS IMLQG VRBS DGGN YTCS 
IHLGNLVFKKTIVLKVSPEEPRTLVTPAALRPLVLGGNQLVIIV 
GI VCAT I LLLPVL I L I VKKTCGNKS S VNSTVLVKNT KKTNPE I K 
EKPCHFER CEGEKHI YSPI I VREVTEE EEPS EKSKAT YMTMKP V 
WPSLRSDRNNSLEKKSGGGMP KTQQAF 


5578


3 


783 


AVESMAS PGAGRAP PELPERNCGYREVEYWDQRYQGAADSAPYD 
WFGDFSSFRALLEPELRP EDRILVLGCGNSALS YEL FLGGFPNV 
TSVDYSSVVVAAMQARYAHVPOliRWST^VRKIiDFPSASFDVVL 
EKGTLDAL1AGERDPWTVSSEGVHTVEQVLSEVSRVLVPGGRFI 
SMTSAAPHFRTRHYAQAYYGWSLRHATYGSGFHFHLYLMHKGGK 
LSVAQLALGAQ ILS PPRPPTS PCFLQDSDHEDFLSAIQL 


5579 


3 


1540 


RNSGIiARGASALARHGGGLAGGVGWDCGACASRCQGVMEGLLTR 
CRALPALATCSRQLSGYVPCRFHHCAPRRGRRLLLSRVFQPQNL 
REDRVLSLQDKSDDLTCKSQRLMLQVGliI YPAS PGCYHLLP YTV 
RAMEKLVRVIDQEMQAIGGQKVNMPSLS PAELWQATNRWDLMGK 
BLLRLRDRHG JC5 YCLG PTHEEAJ TALIASQKKLS YKQLP FXLYQ 
VTRKFRDE PRPRFGLLRGR KF YM KDMYT FDS S P EAAQQTYSLVC 
DAYCSLFWKLGI.PFVKVQADVGTIGGTVSHEFQLPVDIGEDRLA 
I CPRCSFSANMETLDLSQMMCPACQGPLTKTKG I EVGHTFYLGT 
KYSS I FNAQFTNVCGKPTLAEKGC YGLG VTRI LAAAI EVLSTED 
CVRWPSLLAP YQACLI PPKKGSKEQAASELIGQLYDHITEAVPQ 
LHGE VLLDDRTHLT IG NRL KD ANKFG YP FVI IAGKRALEDPAHF 
EVWCQNTGBVAFLTKDGVMDLLTPVQTV 


5580 


1681 


450 


ADAGTRCIPGFWPSGAGYSAPAQRGRRSSGRMRAAAAPGLTAP 
WRLLQCCELEAGELGMAVPAAAMGPSALGQSGPGSMAPWC5VSS 
G PS R YVLGMQEL FRGHS KTRE FIAHS AKVHS VAWS CBGRRLASG 
S FDKTAS VFI*LEKDRLVKENN YRGHGDS VDQLCWHPSNPDLFVT 
ASGDKTIRIWDVRTTKCIATVNTKGENINI CWS PDGQTIAVGNK 
DDWTF I DAKTHRSKAEEQ FKFEVNE I S WNNDNNMFFLTNGNGC 
INI LS YPELKP VQS INAHPSNCIC I KFDPMGKYFATGSADALVS 
LWDVDELVCVRCFSRXI)WPVRTLSFSHDGKMLASASEDHFID3A 
EVETGDKLWB VQCE SPTFTVAWHP KR PLLAFACDDKDG KYDS S R 
EAGTVfCL FGLPWDS


5581




947 


GGGSG PRAPSATLLDTGBS VAAVASGEDKG IAAS AAAAAVFACS 
CS PDPQ S STMNPVYSP VQ PGAP YGNPKNMAYTGYPTAYPAAAPA 
YNPS LYPTNS PSYAPEFQFLHSAYATLLM KQAWPQNSSS CGT2G 
TFHLPVDTGTENRT YQAS S AAFRYTAGTP YKVP PTQSNTAP P P Y 
SPSPWPYQTAMYPrRSAYPOXJWLYAQGAYYTQPVYAAQPHVXHH 
TTWQPNS I P S AI Y PAP VAAPRTNGVAMGMVAGTTMAMSAGTLL 
TTPQHTAIGAHPVSMPTYRAQGTPAYSYVPPHW 


5582


5775 


2739 


I ITNKNNVI IPLVI AYHLSGSAQARGERS PAERLME RQKRKAD I
EKGLCFIQSTLPLKQEEYEAFLLKLVQNLFAEGNDIiFREKDYKQ 
ALVQ YMEGLWVADYAASDQ VAL PRELLCKLHVNRAAClf FTMGLY 
EKALEDSEKALGLDSES IRAXFRXARALNELGRHKEAYECSSRC 
SLAL PHDESVTQLGQELAQKLGLRVRKAYKRPQELETFSLLSUG 
TAAG VAD QGTSNGLGS IDD I ETDCYVD PRGS PALLPSTPTMPLF 
PHVLDLLAPLDS SRTL PSTDSLDDFSDGDVFGPE LDTLLDS LS L 
VQGGLSGSGVPSELPQLIP VFPGGTPLLPPWGGS I PVSSPLPP 
ASFGLVMDPSKKLAASVLDALD?PGPTLDPLDLLPYSETRLI>AL 
PSFGSTRGSLDKPOSFMBETNSQDHRPPSGAQKPAPSPEPCMPM
TALLIKNPLAATHEFKQACQLCYPKTGPRAGDnVREGLEHKCK
RDI LLGRLRSSEDQTWKRI RPRPTKTS PVGS Y YLCKDMINKQDC
KYGDNCTFAYHQEE ID VWTEER KGTLNRDLLFDPLGGVKRGSLT
IAKLLKEHQGIFTFLCEICFDSKPRIISKGTKDSPSVCSNLAAK
HSFYNNKCLVHIVRSTSLKYSKIRQFQBHFOFDVCRHEVRYGCI*
REDSCHFAHSFIELKVWLLQQYSGMTHEDIVQESKKYWQQMEAH
AGXASSSMGAPRTHGPSTFDLQMKFVCGQCtffRNGCWEPDKDLK
YCSAKARHCWTKERRVIJLVMSKAKRKWVSVRPLPS IRNFPQQYD
LCIHAQNGRKCQ YVGNCS FAHSPE ERDM WTFMKSNK I LDMQQT Y
DMMliKKHNPGKPGEGTPISSREGEKQIQMPTDYADIMMGYHCWL
CGKNSNS KKQ WQQH IQS EfCHKEKVPTSDS DASG WAFRF PMGEFR
LCDRLQKGKACPDGDKCRCAHGQBEIiNEWLDRREVLKQKLAKAR
KDMLLCPRDDDFGKYNFLLQEDGDLAGATPEAPAAAATATTGE


5583 


3 


1265 


S SGCRQGRPGRSDRPRPP? RRHXMVXETR Y YDI LGVKPSASPEB 
IKKAYRKLALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD 
QGGEQAIKEGGSGSPSFSSPMDIFDMFFGGGGRMARERRGKNW 
HQLS VTLEDLYNGVTKKLALQKNVI CEKCEGVGG KKGS VEKCP L 
CKCRGMHIHIOQIGPGNVQGIQTVCIECKGQGERINPKDRCSSC 
SGAKVIREKKIIEVHVEKGMKDGQKILFHGEGDQEPELEPGDVI 
rVLDQKDHSVFQRRGHDLIMKMKIQLSEALCGFfCKTlKTIjDNRI 
LVITSKAGEVI KHGDLRCVRDEGMP I YKAPLEKGILI IQFLVI F 
PEKHWLSLEKliPQLEALLP?RQECVRITDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5584 


3 


1265


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE 
IKKAYRKLALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD 
QGGEQAIKEGGSGSPSFSSPMD1FDMFFGGGGRMARBRRGKNW 
WLSVTLEDLYHGVTKKljaiQKWlCEKCEGVGGXKGSVEKCPI, 
CKGRGMHIHI QQ IGPGMVQQI QTVCI E CKGQGER I NP KDRCESC 
SGAKVI REKKI IEVHVEKGMKDGQKI L FHGEGDQE PELEPGD VI 
I VLDQKDHS V FQRRGHDL IMKMKIQLS E ALCGFKKTI KTLDNRI 
LVI TS KAGE VI KHGDLRCVRDEGMP J YKAPLB KG I LI IQFLVIF 
PEKHMLSLEKLPQLEALLPPRQKVR 3 TDDMDQVELKEFCPNEQN 
WRQHRE AYEED EDG P QAG VQCQTA 


5585 


2619 


915 


LPAGTPESSLHEALDQCMTALDLFLTNQFSEALSYLKPRTKESM 
YHSLTYATILEMQAMMTFDPQDILLAGNMMKEAaMLCQRHRRKS 
SVTDSFSShVKRPTLGQFTEEE IHAEVCYAKCLLQRAALTFLQD 
ENMVS F I KGG IKVRNS YQT YKELDS LVQS S Q YCKG3NHPH FEGG 
VKLG VGAFNLTLSML PTRI LRLLEF VGFSGNKDYGLLQLBEGAS 
GHSFRSVLCVMLLLCYHTFLTFVLGTGNVNIEEAEKLLKPYLNR 
YPKGAI FLFLAGft I E VI KGNIDAA IRRFEECCEAQQHWKQFHHM 
CYWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKPA1RKS 
RRYFS SNPI SLP VPALEMM YIIJNGY AVI GKQ PKLTDGILE I ITI< 
AEEKLEKGPEWEYSVDDECLVKLLKGLCLKYLGRVQEABENFRS 
I S ANE KKI KYDH Y L I PNALLEIALLLMEQDRNEEAI KLLESAKQ 
NYKNYSMESRTHPRIQAATLQAKSSLENSSRSMVSSVSL 


5586 
5587 


2619
1768 


915 
148 


LPAGTPES SLHEALDQCMTALDLFLTNQFSEALS YLKPRTKESM 
YHSLT YAT I LBMQ AMMTFDPQD I LLAGN MMKEAQMLCQRHRRKS 
SVTDS FS SLVNRPTLGQFTEEE IHAEVC YAKCLLQRAALTFLQD 
EKMVSFIKGGIKVRNSYQTYEELDSLVQSSQYCKGENHPHFEGG 
VKLGVGAFNLTLSMLPTRILRLLEFVGFSGNKDYGLLQLEEGAS 
GHSFRS VLCVMLLLC YHTFLTPVLGTGNVN I EEAEKLLKPYLNR 
YPKGAIFLFLAGRIEVIKGNIDAAIRRFEECCEAQQHWKQFHHM 
CYWELMWCFTYl<GQWKMSYFYAI)Iil^KENCWSXATYIYWKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRYFSSNP ISLPVPALEMMYIWNGYAVIGKQPKLTDG ILEI ITK 
AEEMLEKG PENEYS VDDECLVKLLKGL CLKYLGRVQE AEENPRS 
I S ANEKK I ICYDHYL I PNALLE LAL LLMBQDRMEEAI KLLESAKQ 
NYKNYSMESRTHFRIQAATLQAKSS LENSSRSMVSS VSL 
SSAVPDGAVGKPVAVAVGGPPHSCRCRPCCLMAAIGVHLGCTSA 
CVAVYKDGRAGWANDAGDRVTPAWAxSfelsteBtvGLAAKQsai 
RNISNTVMKVKQILGRSSSDPQAQKYIAESKCLVIEKNGKLRYE 
IDTGEETKFVKPEDVARLI FS KMKETAHSVLGSDANDVVITVPF 
DFGBKQKNALGEAARAAGFNVLRLIHBPSAALIAYGIGQDS PTG 
K9N I LVFKLX3GTS LS LS VME VNSGI YRVLSTNTDDNI GGAHF TE 
TLAQYLAS EFQRS FKHDVRGNARAMMKLTNS AE VAKHSLS TLGS 
ANCFLDSLYEGGDFDCNVSRARFELLCSPLFNKCIEAIRGLLDQ 
NGFTADDINKWLCGGSSRI P KLQQLI KDLFPAVELLNS I PPDE 
VIPIGAAIEAG ILIGKENLLVEDSLM IECSARDILVKGVDESGA 
SRFTVLPPSGTPLPARRQKTLQAPGSISSVCLELYSSDGKNSAK 
EETKFAQWLQDLDKKENGLRDILAVLTMKRDGSLHVTCTDQET 
GKCEAISIEIAS 


5588


3 


589 


T PP PPEQ AMVAATVAAAWLLLWAAACAQ QEQDFYDFKAVN I RG K 
LVSLEKYRGSVSLWNVASECGFTDQHYRALQQLQRDliGPHHFN 
VLAFPCNQFGQQEPDSNKE3ESFARRTYSVSFPMFSKIAVTGTG 
AHPAFKY LAQTSGKEPTWNFW KYLVAPDGKWGAWDPTVS VEEV 
RPQITALVRKLILLKREDL 


5589 


1884 


553 


LRQAWHEGG IGQTDKERGAAAL PGE EGD PTRGRS LGRASW ESGS 
PRRPRSPFSSPtiPRPTr*T^5TilrAT?PP<i TTP*ni?oxnoQi.Tf2DT>r , a o^o 

GIJ^SGLWLGPDRCRPRSRCSCRVMENPSPAAALGKALCALLL, 
ATU3AAGQPLGGESICSARAPAKYSITFTGKWSQTAFPKQYPLF 
RPPAQWSSIiMAAHSSDYS MWRKNQ Y VS NGLRJDFAERG EA WALM 
KEI EAAGEALQS VHAVFSAPAVPSGTGQTS AELB VQRRHS LVS ? 
WRIVPSPDWFVGVDSLDLCDGDRWREQAAIiDI.YPYDAGTDSGF 
TFSS PNFATIPQDTVTEITSSSPSHPANSFYYPRLKALPP IARV 
TLLRLRQS PRAF1 P PA P VL V SRONE IVDS AS VPETPLDCEVSLW 
S5WGLCGGHCGRLGTKSRTRYVRVQPANNGSPCPELEEEAECVP 
DNCV 


5590 


72 


896 


LCSSGALRLLPAMVAWRSAFLVCLAFSLATLVQRGSGDFDDFNL 
EDAVKETS S VKQP WDHTlTTTTNRPGTTRAPAKPPGSGIiDlADA 
LDDQDIX3PJIKPGI GGRERWNHVTTTTKR P VTTRAPANTLGND FD 
IADALDDRNDRDDGRRKPIAGGGGFSDKDLEDI VGGGE YKP DKG 
KGDGRYGSNDDPGSGMVAEPGTIAGVASALAMALIGAVSSYISY 
QQKKFCFS I QQGLNADYVKGENLEAWCEE PQVKYSTLHTQSAE 
PPPPPBPARI 


5591 


63 


1494 


agssrraaabrjllvsagcrslagrasgvlllpaelLpgeeeama 

LRVTRNS K INAEKKAKI NMAG AKR VPTAP AATS KPGLR PRTALG 
DI GNKVS BQLQAKMPMKKEAKPS ATGKVI DKKLPKPLEKVP MLV 
P VPVSEPVPEPBPEPE PEPVKEEKLS PE P IliVDTAS P S PMETSG 
CAPAEEDL CQAT3DVILAVNDVDAEDQAD PNL CS EYVKDX YAYL 
RQLEEEQAVRPKYLLGREVTGNMRAILIDWLVQVQMKFRLLQET 
MYMTVS I IDRFMQNNC VPKKMLQIAf GVTAMF IAS KYEEMYPP E I 
GDFAFVTDNTYTKEQIRQMEMKILRALNFGIiGRPLPLHFTjRRAS 
KIGEVDVEQHTLAKYLMELTMLDYDMVHFPPSQIAAGAFCLALK 
ILDNGE WTPTLQH YLS YTEESLL PVMQHLAKNAAMVNQGLTKHM 
TVKNKYATSKHAKI STLPQLNSAIATQDLAKAVAKV 


5592


242


924 


YGES KDWNQKDLLSALVXjTTVNCLPTP X MAKSAEVKIAIFGRAG 
VGK5ALWRFI*TKR FI HEYDPTLES TYRHQATIDDBWSME ILD 
TAGQEDTIQRBGHMRWGEGFVLVYDITDRGSFEEVLPLKNILDE 
IKKP^TLILVGWKADLDHSRQVSTEEGBKXJVTELACAFYECS 
ACTGEGNITEI FYE LC RE VRRRRMVQ GKTRRRS STTHVKQAIK K 
MLTKISS 


5593


3 


1113 


HASGGRAANMAAERGAGQQQSQEMMEVDRRVESBESGDBEGKKK
SSGIVADIjSEQSLKDGEERGEEDPEEEHELPVDMETINLDRDAE 
DVDIiNHYRIGKIEGFEVLKKVKTLCLRaNLIKCIENLEELGSLR 
BLDLYDNOX KKIENLEALTEIiEI LDISFNLIiRNIEGVDICLTRLK 
KLFLVNNKI S KIENLSNLHQLQMLEIjGS MR I RAI EN J DTLTNLB 
SI»FI«GKNKITKLQNIiDALTNLTVLSMQSNRl.TKrEGIiQNliVNLR 
ELYLS HNG I EVI BGLENNNKLTMLDI ASNR 1 KK I ENI S H LTELQ 
EFWMNDNLLESWSDliDELKGARSLETVYLERNPLQKDPQYRRKV 
MLALPSVRQlhATPVRP 


5594




1113 


HASGGRAAKMAAERGAGQQQSQEMMEVDRRVES2ESGDEEGKKH 
SSGIVADLSEQSLKDGEERGEEDPBE2HELPVDMETim*DRDAE 
DVDLNHYRIGKIEGFEVLKKVKTLCiRCNLIKCIENLBELQSLR 
ELDLYDNQ I KKI ENLB ALTELEI LD I SFNLLRNI EGVDKLTRLK 
KLFLVNNKI S KI ENLSNLHQIiQMLELGSNR IRAISNI DTLTNLE 
SLFLGKNKITKLQNLDALTNLTVLSMQSNRLTKI EGLQNLVNLR 
ELYLSHNGIEVIEGLENNNKLTMLDIASNRIKKIENISHLTBLQ 
EFWMNDNLLBSWSDLDELKGARSLETVYLERNPLQKDPQYRRKV 
MLALPSVRQJPATFVRF 


5595 


3 


1476 


ARWNGRW VQ VPAW PGPG CGTNASGERQRQLPRAWRPVGRTLGSE 
PlALAWSPPliYLFPIPI*PSWAVSQPTPTLGTMFAI)t»DYDIESDK 
LG I PTVPGJCVTiiOJCnAflivn . Trs t q i czizniinvr t 'Dni JVT\?r\\/T>T\xtT'n 

AALDGT VAAG DE I TGVNGRS I KGKTKVE VAKM IQEVKGEVTIHY 
NKLQADPKQGMSLDIVLKKVKHRLVENMSSGTADALGLSRAILC 
^GLVK^LEELERTAELYKG>TrEHTKNLLRAPYEL3QTHRAPGD 
VPSVlGVREPOPAASEAFV^FADAWR<lTFTn?RTP~.r.vn*TirDvrT t 

DLNTYtiNKAI PDTRLTI KKYLDVKFE YIiS YCLKVTCEMDDEEYSC 
IAIX3EPLYRVSTGNYEYRLILRCRQEARARFSQMRKDVLEKM2L 
LDQKHVQDI VFQLQRLVSTMS KYYNDCYAVIjRuAD VF PI BVDIA 
HTTIAYGI^QEEFTOGEEE^EEEiyrAAGEPSRDTRGAAGPLDKG 
GSWCDS 




698 


219 


GAVLAPSSLPAAEIAAOGESOSLtBDI* qmtqb PT<jr wvt c b* t pd" 
NGDKYDGIX^R'TSSGIYl^GIGIHTTP^IVyTOSMKDDKMNG 
FGRLEHFSGAVYEGQFKDNMFHGI^TYTFT^GAKYTGNFNEIIRV 
KGEGEYTHIQGTRMDWTFHFTSCSQ? 


5597 


3 


731 


ISCKMAAIX5QSSLPASWRSVTLTHVEYPAGDI>SGHLLAYLS1>SP 
VFVI VGFVTLI I FKREl^tTIS FLGGLALNEG VNWL I KNVIQEPR, 
PO^PHTAVGTKYGMPSSHSQFMWFFSVYSFI^LYLRMHQTNNA 
RFLDLLWRHVLSLGLLAVAPIiVSYSRVYXVLYHTWSQVLYGGIAG 
GLMAIAWFIFTQEVXiTPLFPRrAAKPVSEFFLlRDTSLIPKVLW 
FEYTVTRAEAKNRQRKLGTKLQ 


5598 


326 


2440 


GIGPIAASFIPCKVASLYIFLSPPPPSVSGVPYSPANSSWSCAL 
VPLLGSGVPPHPPAPSPCCSGQTMLKMLS FKLLLLAVALGFFEG 
DAKFGERNEGSGARRRRCLNGNPPKRLKRRDRRMMSQLELLSGG 
EMDCGGFYPRLSCCLRSDS PGIiGRLENXX FSVTNNTECGKLLEE 
IKCALCS PHSQSLFHS PEREVLERDLVLPLLCKDYCKEFFYTCR 
GHIPGFLQTTADBFCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMBE YDKVEEI SRKHKHNCFCIQEWSGLRQPVGAIiHSGDGSQR 
L FILE KE G YVK 1LTPEGE I FKEP YLD1 HXLVQSGI KGGDERGLI* 
SLAFHPN YKKNG KLYVS YTTNQ3RWAI GPHDH ILRWEYTVS RK 
NPHQVDLRTAEVFLEVAELHRKHLGGQLIiFGPDGFLYI I LGDGM 
ITLDDME EMDGI^DFTGS VLRLDVDTDMCNVPYS I PRSNPKFNS 
TNQPPBVFAHGLHDPGRCAVDRHPTDI NINLTILCSDSNGKNRS 
SARILQX I KGKDYESE PSLLBFKPFSNGPLVGGFVYRGCQSERL 
YGS YVFGDRNGNFLTLQQS PVT2CQWQB KPLCLGTSGSCRG YFSG 

H IIjGFGEDEIiGEVYI lsss ksmtqthngklyki vdpkrplmpee 

CRATVQPAQTLTSECSRLCRNGYCTPTGKCCCSPGWEGDFCRTG 


5599 


326


2440 


gigpiaasfifckvaslyiflsppppsvsgvpyspai^sswscAl 
vplixsgvpphppapspccsgqtmi^i^fkliii^lavalgffeg 
dakfgepjtegsgarrrrclngnppkrlkrrdrrmmsqlellsgg 

EMLCGGFYPRLSCCLRSDSPGLGRLENKI FSVTNNTECGKLLEE 
IKCALCSPHSQSLFHSPEREVLERDLVLPLLCKDYCKEFFYTCR 
GHI PoFLQTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMEEYDKVEEISRKHKHNCFCIQBVVSGLRQPVGALHSGDGSQR 
LFTLEKEOYVKILTPEGEIFKEPYLDIHKLVQSGIKGGDERGLL 
SLAFHPNYKKNGKLYVS YTTNQBRWAIGPHDHI LRWEYTVSRK 
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLYI ILGDGM 
I TLDDMEEMDGLS D FTGS VLRLD VDTDMCNVP YSIPRSNPHFNS 
TNQ P PE VFAHGLHDPGR CAVBRHPTDIN INLT ILCSDSLNG KNRS 
SARILQIIKGKDYESEPSLLEFKPFSNGPLVGGFVYRGCQSElRIi
YGSr»/FGDRNGNFLTLQQSPVTKQWQBKPLCLGTSGSCRGYFSG 
HILGFGEDBLGEVYILSS S KSMTQTHNQKLYKI VDPKR PLMPEB 
CRATVQPAQTLTS ECSRLCRNGYCTPTGKCCCS PGWEGDFCRTG 


5600 


1977 


1244 


SLRVLSGHLMQTRDL VQP D KPAS PKF I VTLDG V PS PPG YMSDQB 
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF 

FPNCKFAEKCLFVHPNCKYDAKCTKPDCP FTHVSRRI P VLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFyHPKHCRFNTQCTRPDC 
TF YHPTI NVP PRHALKWI RPQTS E 


5601 


1977 


X6*t*i 


SLRVLSGHLMQTRDLVQ PDKPAS PKFI VTLDG VPS PPG YMSDQE 
EDMCKEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF 
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHP ISPCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRI P VLS PKP 
AVAPPAP PSSS QLCRY F PACKKME CPF YHPKHCRFNTQCTRPDC 


5602 


246 


766 


YHTSCTVWRTAKEALENTEVPVGCLMVYNNEWGKGRNEVNQTK
NATRKAEMVAIDQVLDWCRQSGKSPSEVFEH'rVLYVTVEPCIMC 
AAALRLMK IPLWYGCQNER FGG CGS VIiNI AS ADLPNTGRP FQC 
IPGYRAEEAVBMLKTFYKQENPNAPKSKVRKKECQQILNMF 


5603 


1 


565 


PRGRT P I SGGEKG CAQY P I PATPARS GENRTM PGAGDGGKAPAR 
WIX5TGLLGLFLLPVTLSLEVS VGKATD I YAVNGTE 1 LLPCTFSS 
CFGFBDLHFRWTYNSSDAFKI LIEGTVKNEKSDPKVTLKDDDR I 
TLVGST KEKRNN 1 S I VLRDLE 7SDTG K YTCHVKNP KENNLQHHA 
T IFLQ WDRRMQ 


5604 


1 


1506 


EDIFPAQLLKLQRHERVWQQBPPVRDHRSWGGSGAGGVAGREWT
DQGQVALGGH YMAEGEG YFAMS EDELACS P YIPLGGD FGGGDFG 
GGDFGGGDFGGGDFGGGGSFGGHCLD YCES P TAHCNVLNW E QVQ 
RLDG ILSET I P IHGRGNFPTLE LQPS LIVKVVRRRIiAEKRI GVR 
DVRLNGSAASHVLHQDSGLGYKDLDLIFCADLRGEGEFQTVKDV 
VLDCLLDFLPEGVNKEKITPLTLKEAYVQKMVKVCNDSDRWSLI 
SLSNNSGKNVELKFVDSLRRQFEFSVDS FQ I KLDS LLLF YECS E 
NPMTETFHP XI IGB S VYGD FQEAFDHLCNK I 1 ATRNPE E IRGGG 
LLKYCNLLVRGFR PASDEI KTLQRYWCSRFFIDFSDIGEQQRKL 
ESYU^FVGLEDRKYEYI>MTLHGVVNESTVCLMGHERRQTLNL 
ITMLAIRVLADQNVI PNVANVTC Y YQ PAP YVADANFSN YYIAQV 
QPVFTCQQQTYSTWLPCN 


5605 


35


1621 


SQRSCPRS PSSPAPPWARCSNPDS RTGGVP VPRAWSAGGPALGL 
MAAPVRLGRKRPLPACPNPLFVRWLTEWRDEATRSRHRTRFVFQ 
KALRSLRRYPLPIiRSGKEAKILQHFGDGLCRMLDERLQRHRTSG 
GDHAPDSPSGENSPAPQGRLAEVQDSSMPVPAQPKAGGSGSYWP
AAnounn v x uxt v u x K£* tiuri f£i Kjtinc Li X xvc. £AjL> \j KLLAy Kb PKVAP 
GSARP WPALRS LLHRNL VI «RTHQ PARYSLTPEGLELAQKLAESE 
GLSLLNVGIGPKEPPGEETAVPGAASAELASEAGVQQOPLELRP 
GEYRVLLCVDIGETRGGGHRPELLRELQRLHVTHTVRKIiHVGDF 
VV7VAQBTNPRDPANPGELVLDH1VERKRLDDLCSSIIDGRFREQ 
KPRLXRCGLERR VYLVEEHGS VHNLS LPE5TLLQAVTNTQVI DG 
FFVKRTAD I KES AAYLALLTRGLQRL YQGHTLRSRPWGTPGNPE 
SG AMT5 PN PLCS LLTFS DFNAGAIKNKAQS VREVFARQLMQ VRG 
VSGE KAAALVDRYSTPASLLAAYDACATPKBQETLLSTIKCGRL 
QRNLGPALS RTLSQL YCS YG PLT 


5606 


3 


1099 


GRSRCPGPGARGGTMSPRSCLRSLRLLVFAVFSAAASNWLYLAK 
LSS VGS ISEEETCEKLKGLIQRQVQMCKRNLEVMDSVRRGAQLA 
IBECQYQFRNRRWNCSTLDS LFVFGKVVTG^TREAAFVYAI SSA 
GVAFAVTRACS SGELBKCG CDRTVHGVSPQGFQWSGCS DN I AYG 
VAFSQS FVDVRERS KGAS S S RALMNLHNNEAGRKAILTHMRVEC 
KCHGVSGSCB VKTCWRAVP PFRQVGHALKEKFDGATE VB PRRVG 
SSRALVPRNAQFKPHTDEDLVYLEPSPDFCBQDMRSGVLGTRGR 
TCNKTSKAIDGCELLCCGRGPHTAQVELABRCS CKFHWCCFVKC 
RQCQRLVELHTCR 


5607


521 


141 


PPVCNPAEAMPSPGTVCSLLLU»ir.WLDLAMAGSSFliSPEHQg7~ 
QQR KBSKK P PA KLQ PRALAG WLRP EDGGQ AEGAEDELE VRFNAP 
FDVGI KI*S G VQYQQH SQALG KFLQD I LW EE AKBAPADK 


5608


2 


983 


WFQSPLRQADPGPPRHTLFMDFVAGAIGGVGGDAVGYPLDTVKV
RIQTEPKYTGXWHCVRDTYHRBRVWGFYRGLLLPVCTVSLVSSE

VFGTYRHCLAHlCRLRFGNPDAKPTKADITIiSGCASGLVRVFLT 
S PTEVAKVRLQTQTQAQKQQRRLSASGPIAVPPMCP VP PACP B P 
KYRG PLHCLATVAREEGLCGLYKGSS ALVLR DGHS FATYFLS YA 
VLCE WLS PAGHS RP DVPGVLVAGGCAGVLAWAVATPMDVIKSRL 
QADGQGQRRYRGLLHCMVTIVREEGPRVLFKGLVUICCRAFPVN 
MWFVAYEAVLRLARGLLT 


5609 


1628 


304 


AKGVWVLPSPPPRPGRGALVSGSGLRRGRSGTSWRPRRMNHKSK 
KR IREAKRSARPELKDS LDWTRHNYYES FSLS PAAVADNVERAD 
ALQLS VEE F VER YER P YK PWLLNAQ 3GWS AQEKWTLERLKRKY 
RNQKFKCGEDNDGYSVKMKMKYYIEYMESTRDDSPLYIFDSSYG 
EHPXRRKLLEDYKVPKFFTDDLFQYAGEKRRPPYRWFVMGPPRS 
GTGIHIDPLGTSAWNALVQGHKRWCLFPTSTPRELIKVTRDEGG 
NQQDEAITWFNV I YPRTQLPTW P p E FKPLE 1 LQKPGETVFVPGG 
W WH WLNLDT T IAI TQN FAS S TN F P WWHKTVRGR P KLS RKW YR 
ILKQEHPELAVLAD3 VDLQE STG I ASDSSSDSSSS S SSS S SDSD 

SECESGSEGDGTVHRRKKRRTCSWVGNGDTTSQDDCVSKBRSSS 
R 


5610 


54 


1196 


LERTPAS AbMAWTKYQLFLAGLML VTGS INTLSAXWADN FKAEG 
CGGSKEHSFQHPFLQAVGMFLGEFSCLAAFYLLRCRAAGQSDSS 
VDPQQPFNPLLFLPPALCDMTGTS LMYVALNMTSASSFQMLRGA 
VI I F1X3LFS VAFLGRRLVLSQWLG ILATI AGLVWGIADLLSKH 
DSQHKLS EVI TGDLLI I MAQI 1 VA I QMVLEEKFVYXHNVHPLRA 
VGTEGL FGFVI L S LLIiVPM Y YI PAGS F S GNPRGTLEDALDAFCQ 
VGQQPLI AVALLGN1SS I AFFNFAGI SVTKELSATTRMVLDSLR 
TWIWALSLALGWEAFHALQI LGFLILLIGTALYNGLHRPLLGR 
LSRGRPLAEES EQERLLGGTRTPINDAS 


5611 


2 


577 


FVLPNRLGIPGS TFRGPGACAS SSSLAASAKPGAGGS PALAMSG 
BLSNRFQGGKAFGLLKARQERRLAE3NREFLCDQKYSDEENLPE 
KLTAFKE KYMB FDLNNEGEI DLMSLKRMMEKLGVPKTHLEM KKM 
I SE VTGGVSDTI S YRDFVNMMLGKRSAVIiKLVMMFEGKANESSP 
KPVGPPPERDlASIiP 


5612 


1 


721 


ASRDG YMDATIAPHRI PPEMPQYGRENHI FELMQAMWLCKHLNS 
S LLTL ENLI LNEFS YTATEARRLYLQRKT VPSALLVQLIQERLA 
EEDCIKQGWILDCI PETRBQALRIQTLGITPRHV JVLSAPDTVL 
t ERNLGKRIDPQTGE I YHTT FDWPPESEI QNRLMVPEDISEIjET 
AQRLIiEYHRNI VRV I PS Y PKI 1>KV I SADQPCVD VF YQALTYVQS 
NHRTNAPFTPRVLLLGPVGS 


5613 


115 


1279 


RGVDPALRRAEKMLPLSIKDDEYKPPKFNLFGKISGWFRSILSD 
KTSRKLFFFLCLNLSFAFVELLYGIWSNCLGLISDSFHMFFDST 
AI LAGLAAS VI S KWRDNDAFS YG YVRAEVLAGFVNGLFLI PTAF 
FI FSEGVERALAPPDVHHERLLLVSILGFWNLIGI FVFKHGGH 
GHSHGSGHGHSHSLFNGALDQAHGHVDHCHSHEVKHGAAHSHDH 
AHGHGHFHSHDGPSLKETTGPSRQILQGVFLHILADTLGSIGVI 
ASAir^MQNFGLMlADPICSILIAILIWSVlPLLRBSVGlLMQR 
TPPIiLBNSLPQCYQR VQQLQG V YS LQEQHFWTLCS D VYVGTLKL 
I VAP DAD ARW ILSQTHN I FTQAG VRQL YVQ I D FAAM 


5614 


3 


iTFSe 


LLSRNEHACPLQAGLGLTQRKPKAIRGREGRATNQGQGETQNER 
APWGARQRLG VMAELQQLQE PE I PTGREALRGNHS AUUR VAD YC 
EDNYVQATD K R KALE E TMA FTTQALAS VA YQ VGNLAGHTLRM LD 
LQGAAtJ^QVEARVSTLGOMVNMHMBKVARREIGTIJVTVQRLPPG 
QKVIAPENLPPLTPYCRRPLNFGCLDDIGHGIKDLSTQLSRTGT 
LSRKSIKAPATPASATLGRPPRIPEPVHLPWPDGRLSAASSAS 
SLASAGSAEGVGGAPTPKGQAAPPAPPLPSSLDPPPPPAAVEVF 
QRPPTLEBLSPPPPDEELPLPLDLPPPPPLDGDELGLPPPPPGF 
GPDEPSWVPASYI^KVVTLYPYTSQKDNELSFSEGTVICVTRRY 
SDGWCEGVSSEGTGFFPGNYVEPSC


5615 


9 


1558 


ai^rrrpgdpremeaaatpaaagaakkeeldmdVmrpLINeQnf~ 

DGTS DE EHEQELL PVQ KHYQLDDQEGI S F VQT1MHLLKGNIGTG 
LLGLPLAIKNAGIVLGPISLVPIGIISVHCMHTr vrrsr-cucT m D 
FKKSTLG YSDTVS FAM2 VS P WS CLQKQAAWGRS WD FFLVITQL 
GFCSVYIVFLAENVKQVHEGPLESKVPISNSTNSSNPCBRRSVD 
LRIYMLCFLPFIIUjVPIRBLKNLFVLSFLANVSKAVSLVIIYQ 
YWRNKPDPHNLPIVAGWKKYPLFFGTAVFAFEGIGWLPLENQ 
MKESKRFPQALNIGMGIVTTLYVTLATLGYMCFHDBIKGSITLN 
L PQDVWL YOS VKT L YS FGT PV/TV Q TfYPWDTv pttt nr« ttp rrrr,^ 

KWKQ I CE FG I RS FLVS I TCAGAI L I P RLD IVI SFVGAVS SSTLA 
L ILPPL VEI LTFSKSH YN1 WMVLKNI S I AFTG WGFLLGTY I TV 
BEriYPTPKWAGTPQSPFLNLNSTCLTSGLK 


5616


1 


719 


DD FVROG PQ S AAMGAS ARLLRAV I MGAP GSGKGT VS S R I TTH FE 
jj±\±Lj-i+jsj^isxj±jj\ui.Hi w \iji\\3 x oxvj v wiTtAr iUlvGKXilPDuVMTRLAli 
HELKNLTQYSWL^GFPRTLPQAEALDRAYQIDTVINLNVPFEV 
I KQRLTARW IHPASGRVYNTEFNP PKTVG I DDLTGE P L I QREDD 
KPETVIKRLKAYEDQTKPVLEYYQiOCGVLETFSGTETNKIWPYV 
YAFLQTKVPQRSQKASVTP 


5617 


176 


765 


P WRGR GS R PRG AG AMAE E Q VNR SAGLAPDCEASATAE XT VS SVG 
TCEAAGKSPEPKDYDS TCVFCR IAGRQDPGTEliLHCENEDLICF 
KDI KPAATHHYLWPKKIirGNCRTLRKDQVELVENMVTVGKTIL 
ERNNFTDFTNVRMGFHMPPFCS ISHLHLHVLAPVDQLGFLSKLV 
YRVNS YWF I TADHL I E KLRT 


5618 


3 


1692 


YI^YINL^EWKLSGKEDLWEKI^YLWKSTLNLPEDLLRVPDES 

LFLNSGGDSLKSIRLLSEIEKLVGTSVPGLLEI ILSSS ILEI YN 

HlLQTVVPDEDVTreKSCATKRKLSNINQEEASGTSLHQKAr>rr 

r i. utunniruir v Vij&KVji>yiL»bijNb rRFLVTKLGHCSSACPSDS VS 

QTN IQNLKGLNS P VLI G KS KDPS CVAKVSE EGKP AIGTQKME LH 

VRWRSDTGKCVDASPLWI PTFDKSSTTVYIGSHSHRMKAVDFY 

SGKVKWZQIIiGDRIESSACVSKCGNFIWGCYNGLVYVLKSNSG 

EKYVmFTTEDAVKSSATMDPTTGLIYIGSHDQHAYALDIYRKKC 

VWKSKCGGTVFSSPCLNLI PHHL YFATLGGLLLAVN P ATGNVI W 

KHSCGK^LFSSPfV , PQrtYTr , Tf^r'TJTV" , wr r r^'crnzi-arfct^TT^^rM^mn 
»uikJx.\7o. xji jjry^Loy -I J. u J.^uviA^W^jJjL.x* lHrGisQVWQFSTS 

GP 1 FS S PCTS PS EQKI FFGSHDC F I YCCNMKGHLQVf KFETTS R V 

YATPFAFHJfYNGSNEWLLAAASTTOKVWILESQSGQLQSVYELP 

GEVFSSP WLESMLI IGCRDNYVYCLDL1/3GNQK 


5619 


2160 


1477 


DSPVL PTSGNVI STAQPAQP WSAVEAALRSiiGS PPGAGRGCP CP 
AQSIiHSHQLAAWDPLKPSLRSYPPHLLQHPQLRSLTASSGHLGR 
RSC PQ PRP LEELLRAGS STRPQPLTSS CCGMS CM ys pt r»w r q vt 
LWGTKGRGSGSP5 S PGCCLHPPAQHSQDLPLVHVDVGWQPPLGP 
TVGLRPGLU3ERQRGALRAGDPQCQCPLPATVR EDLGVPSP WAA 
ECSPPATP 


5620 


930 


182 


PLPPPTIAMFLTRSEYDRGVNTFSPEGRliFGVE YAI EAI KLGST

AIGIQTSEGVCLAVEKRITSPIMEPSSIEKIVEIDAHIGCAMSG 

LIADAKTLIDKARVEXQNHWFTYNETMTVESVTQAVSMLALQFG 

EEDADPGAMSRPFGVAIiLFGGVDEKGPQLFHMDPSGTFVQCDAR 

AIGSASEGAQSSLQEVYHKSMTLKEAIKSSLIILKQVI^EEKLNA 

TNIEIiATVQPGQNFHM FTKEE LE E VIKD I 


5621 


3 


819 


WEF VE YTATDANVKNESLSS VQQLGI KMTVRYGKFLjS IiLKDGA 
ENDLTWVLKHCERFLKQQQTSIKSSLLCLQGNYAGHBWFVSSLF 
MIMLGDKEKTFQFIiHQFSRLLTSAFLWLPRLHISSYLPNDTVES 
GrHPVYFCSTHYlBMLLKAELPLVFSAFHMSGFAPSQICLQWIT 
Q CFWNYLD WI E1CH Y I ATCVFLGPDYQVYI CIAVPKHLQQDILQ 
HTQTQDLQ VF1»KEEALHGFRVSD YFE YME I LEQNYRT VLLRDMR 
NIRCiQST 




1122 


456 


AASTKDAVSRKRSHSASEKSGTGTSISKRLNl^NPQIRNPMKAMY
PGTP YFQFKNLWBANDRNETWLCFTVEGI KRRS WS WKTGVFRN 
QVDSETHCHAERCFl^SWFCX)DILSP^mCYQVTWYTSWSPCPDCA 
GEVAEFLARHSNVNLTI FTARL YYFQ YP C YQEGLRSLS QEG VAV 
-cur iv x uwcwr v x MW1S*C JtPWKGLKTNFRLLKRRLRESL 

Q 


5623 


3 


954 


FLP FF I RA P KI S RNG QWLFTFTTP FP FANKAL PG WEG I VPACFW 
RKKILTPSTGTMELLQVTILFIXPSICSSNSTGVLEAANNSbW 
TTTKPS ITl'PNTESIiQKKVVTPTTGTTPKGTlTNEIiIiKMSIMST 
ATFLT3KDBGLKATTTDVRKNDS 2 ISNVTVTS VTLPNAVSTLQS 
S KPKTE TQS S I KTTB I PGS VLQPDASPSXTGTLTSI P VT I PENT 
SQSQVIG7BGGKNASTSATSRSYSSIILPWIALIVITLSVFVL 

VGLYRMCWKADPGTPENGNDQPQSDKESVKLLTVKTISHE5GBH 
SAQGKTKN 


5624 


159 


898 


PGVAAAAGALPQYrtG PAPAL VSCRRELSLSAGSLQLERKRRDFT 
SSGSRKLYFDTHALVCLLEDNGFATQQAEI I VSALVKILEANXD 
IVYKDMVTKMQQEITF0QVMSQIANVKXDM1ILEKSEFSALRAE 
NEKIKLELHQLKQQVMDEVI KVRTDTKLDFNLEKSRVKELYS LN 
EKKLLELRTEIVAIiHAQQDRALTQTDRKIETEVAGLlCrMLESHK 
LDNI KYLAGS I FTCLTVALGFYRLWI 


5625 


1 


1180 


TIP S S AAA^ RAG P PAGAL EALS PGGARAHAE RRG BMRAT P LAAP 
AGSLSRKKRLELDDNLDTERPVQKRARSGPQPRLPPCLLPLSPP 
TAPDRATAVATASRU3PYVLL3PEEGGRAYQALHCPTGTBYTCR 
VYPVQEAIAVLEPYARLPPHKHVARPTEVLAGTQLLYAFFTRTH 
GDMHS LVRSRHR I PE PEAAVL FRQMATALAHCHQHGLVLRDLKL 
CRFVFADRERKKLVLENLEDSCVLTGPDDSLWDKHACPAYVGPE 
ILSSRASYSGKAAPVWSLGVALFTMLAGHYPFQDSEPVLLFGKI 
RRGA Y ALPAG LSAPARCIiVRCLLRR E PAERLTATG I LLHP WLRQ 
DPMPLAPTRSKLWEAAQWPDGLGLDEAREEEGDREWLYG 


5626 


3123 


2011 


e tfKALG S VAMENQ VLTPH VYWAQRHRE LYLRVELSDVQNPAI S I 
TENVLH FKAQGHGAKGDNV YEFHLEFLDLVICPE PVY KLTQRQVN 
ITVQKKVSQWWBRLTKQEKRPliFJjAPDFDRWLDESDAEMELRAK 
EEERLNKLRLESEGSPETLTNLRKGYLFMYNLVQFLGFSWIFVN 
LTVR FC ILG KES FYDTFHTVADMM YFCQMLAWETI NAAIGVTT 
S PVLPSLIQLLGRNFILFII FGTMEEMQNKAWFFVFYLWSAIE 
I FRYS FYMLTC 1 DMD WKVLTVJLRYTLWIPLYPLG CLAEAVS VI Q 
SIPIFNETGRFSPTLPYPVKIKVRFSFFLQIYLIMIFLGLYINF 
RHLYKQRRRRYGQKKKKIH


5627


3121 


2011 


PPRALGSVAMENQVLTPHVYWAQRHRELYLRVELSDVQNPAisi 
TENVLHF KAQGHGAKGDNVY S FHLEFLDLVKPEP VY KLTQRQVN 
ITVQKKVSQWWERLTKQEKRPLFLAPDFDRWLDESDAEMELRAK 
EESRLNKLRIjBSEGSPETLTNLRKGYLFMYNLVQPLGFSWIFVN 
LTVRFC I LGKBS FYDT FHTVAJDMM YFCQ MLAWE T I NAA I GVTT 

SPVLPSLIQLLGRNFILFIIFGTMBEMQNKAWFFVFYLWSAIB 
I FRYS FYMLTC IDMDWKVLTWLRYTLWI PLYPLGCLAEAVSVIQ 

SIPIFNETGRFSFTLPYPVKIKVRFSFPLQIYLIMIFLGLYINF 
RHLYKQRRRRYGQKKKKIH 


5628 


75 


1455 


vagamaskclkagfssgslkspggasggstrvsamyssspckXp " 

S LSP VARS FS ACS VGLGRS S YHATSCLPALCL PAGGFATS YSGG 

ggwfgegiltgneketmqslndrlagylekvrqleqenaslbsr 
irewceqqvpymcpdyqsyfrtieelqkktlcskaenarlwei 

DNAKLAADDFRTKYETEVSLRQLVESDINGLRRILDDIjTLCKSD 

leaqveslkeellclkknheebwslrcqlgdrlnvevdaappv 
dlnrvleemrcqyetlvennrrdaednldtqseelnqqvvssse 
qlqscqaeiielrrtvnaleielqaqhsmrdalestlaeteary 

SSQIAQ^QCWITN^AQLAEIRADLERQNQEYQVIiD\rRARLEC j 
EINTYRGLLESEDSKLPCNPCAPDYSPSK5CLPCLPAAS cgpsa 

artncsarpicvpcpggrf 


5629 


2287 


938 


GRPR6SSDNRNFLRERAGLSSAAVQTRIGNSAASRRSPAARPPV~ 
PAPPALPRGRPGTEGSTSLSAPAVLWAVAVWWVSAVAWAMA 
NYIHVPPGS PE VPKLNVTVQDQEEHRCRBGALSLLQHLRPHWDP 

qevtlqlftdg itnkl igcyvgntmed wlvri ygnktellvdr 
deevksfrvlqahgcapqlyctfnnglcyefiqgealdpkhvcn 

PAIFRLI ARQLAKI HAI HAHNGWI PKSNLWLKMGK YFSL I PTGF 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)

ADEDINKRt'LSDIPSSQIM3k£MTWMkEILSra^SPVVLCHNDin 
LCKNIIYNBKQGDVGPIDy^YSGYNYLAYDIGNHFNEFAGVSDV 
DYSLYPDRELQSQWLRAYLSAYKEFXGFGTEVTEKEVE I LPIQV 
NQFALASH P FWGLWALI QAKYSTI E PDFLG YAI VRFNQ YFKMKP 
BVTALKVPB . 


5630 


1194 


278 


GFWAIAOT(^mr J ppGSPWLVPASPWRIiPEMSSFGYRTLTVALF| 

TLICCPGSDEKVFEVHVRPKKLAVEPKGSLEVNCSTTCNQPEVG 

GLETSLDKIU,DEQAQWKHYLVSNISHDTVLQCHFTCSGKQESM 
KSNVSVYOPPROVIliTLOPTTkVaVfiTf QPTT"pr»Dirta'mri7T»T nevr ™ 1 

LFLFRGNETLKYETFGKAAPAPQEATATFNSTADREDGHRNPSC 
LAVLDLMSRGGNIFHKHSAPKMLBI YE PVSDSQMVI I VTWSVL 
LSLFVT5VLLCFIFGQHLRQQRMGTYGVRAAWRRLPQAFRP | 


5631 


1053 


290 


SRVDDFVRPEPSRAEPSRSGRRRPARRAAT^SVFtiiOiFGAGGGKj 
AGKGGPTPQEAIQRLRDTEEMLSKKQEFLEKKIEQELTAAXKHG 
TKNKRAALQALKRKKRYEKQrAQIDGTLSTIBFQREALENANTN 
TE VLKNMGYAAKAMKAAHDNMDIDKVDELMQD I ADQQELAEE IS 
TAISKPVGFGBRPDP!nPT.NIlkT7T.I?PT.'n , nT7E»T , nvkTT r pTo^npimm 1 

LPNVPS IALP3KPAKKKEEEDDDMKELENWAGSM | 


5632


3 


952 


wlgwspprrlwwgslgaaqrpavpvsglarslh^trrphrraH 
ftfvs sadaedls gs i as pdvkxnlggdfi kestattflrqrgy 

GWLLEVEDDDPEDNKPLLEEI^IDLIO^IYYKIRCVLMPMPSLGF 
NRQ WRDNPDFWG P LAWLFFS MI SLYGQFRWS WI ITI W I FGS 
LTIFLLARVLGGEVAYGQVLGVIGYSLI.PLIVIAPVLLVVGSFE 

FLSLYTGV j 


5633 


771 


460 


qgcsktmsvgrpfyrssefmeqllsshlhqvpffccft^cLcnH 

CLFENS VS KLYMLCFNFFMS I FFYSLS I TKLNI*I YLWGL S YQS U 
LLLLIiSGHRPWGSSMV | 


5634 


1446 


855 


FKATG R i rsraaas rpragag asgae prsgrbrsrlsgrrapam j 
arntlssrfrrvdidefdenkfvdeqeeaaaaaaepgpdpsevd 
gcilrqgdi4lrafhaat»rns p vntknqavkeraqgwlkviitnfk 

SSEIEOAVOSLDRN'GVniiriM5rYTYTf<^T?l7lfDTCTacGft^rr Tnuutu/ 
ALAVGGLGS I IRVLTARKTV j 


5635 


3 


943


DRGPRSTATOTGRARVSFWRFPLDPGVKWSNVQISGEKRRFRTL H 
RSLFHP FP VTRSGAPRAVLVGSS W PAKMVAPAVKVARGWSGLAli 
GVRRAVLQLPGLTQVRWS RYS PE FKDPLIDKEYYRKP VEELTEE 
EKYVREI*KKTQLIKAAPAGKTSSVF^DPVISKFTNMMMIGGNKV 
LARSLM I QTLEAVKRKQFEKYHAASAEEQAT IERNP YTI FHQAL 
KNCEPMIGLVPILKGGRFYQVPVPLPDRRRRFLAMKWMIXECRD 
K KHQRTL M P EKLSHKL L EAFHNQ G P VI KR KHDLHKMAKANRALA 
HYRWW 


5636 


2253 



1143 


l^DTICQHPPAEKKLYLYHRKLREVERNGIPRLPKDVPMOTHQGl 
LTDVRAKVTGFSEG WDS VKGGFS S FSQATHSAAGAWSKPRB I 
ASLIRNKFGSADNI PNLKDSLEEGQVDnAGKALGVI SNFQSS PK 
YGSEEDCSSATSGSVGANSTTGGIAVGASSSKTNTLDMQSSGFD 
ALLHEIQEIRETQARLEES FETLKEH YQRJD YSL IMQTLQEERYR 
CERLEEQLNDLTELHQNE I LNLKQ ELASME BKI AYQS YERARD I 
QE ALEACQTR I S KMELQQQQQQ WQLEGLENATARNLLGKLIN I 

LLAVMAVLLVFVSTVANCVVPLMKTRNRTFSTLFLVVFIAFLWK 
HWDALFSYVERFFSSPR 


5637 


948 


2532 


MSFCGARANAKMMAAYNGGTSAAAAGHHHHHMHHLPHLPPPHLH | 

HHHHPQHHLHPGSAAAVHPVQQHTSSAAAAAAAAAAAAAMLNPG 

QQQPYFPSPAPGQAPGPAAAAPAQVQAAAAArVKAHHHQHSHHP 

QQOLDIEPDRPIGYGAFGWWSVTDPRDGKRVALKKMPNVFQNL 

VSCKRVFRELKMLCFFKHDNVLSALDI LQP PHI DYFEB I YWTB 

LMQSDLHKI IVS PQPLSSDHVKVFLYQ I LRG L KYLHS AG I LHRD 

IK^GNLLVNSNCVLKICDFGLARVEELDESRHMTQEWTQYYRA 

PEILMGSRHYSNAIDIWSVGCIFAELLGRRILFQAQSPIQQLDL 

ITDLLGTPSLEAMRTACEGAKAHIIiRGPHKQPSLPVLYTLSSQA 
THEAVH LLCRML VFDP YKR ISAKDALAHPYLDEGRLRYHT?CMCK 
v-v,r ;>ii>i<oKV 1 1 z>ut fcit'v INFKrUDTFEKNLSSVRQVKEIIHQF 
ILEQQKGNRVPLCrNPQSAAFKSFISSTVAQPSEMPPSPLVWB j 


5638


125 


1155 


DRKMSELDQLRQ2AEQLKNQIRDARKACADATLSQITNWIDPVG " 
RIQMRTRRTLRGHLAKIYAMHWGTDSRLLVSASQDGKLIIWDSY 
TTNKVHAIPLRSS WVMTCAYAPSGNYVACGGLDNICS IYNLKTR 
EGNVRVSRELAGHTGYLS CCR PLDDNQIVTS SGDTTCALWD IET 
G0XJ7TTFTGHTGDVMSLSLAPDTRLFVSGACDASAKLWDVRBGM 
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRLFDLRADQEL 
KTYSHDNI I CGI TS VS FSKSGRLLLAGYT)DFNCNVWDALKADRA 
GVLAGHDNRVS CLGVTDDGMAVATGS WDSFLKIWN 


5639 


125


1155 


DRKMSELDQLRQEAEQLKNQIRDARkACADATLSQITNNIDPVG 
R IQMRTRRTLRGHLAK I YAMHWGTDSRI*LVSASQDGKLI I WDS Y 
TTNKVHAIPLRSSWVNiTCAYAPSGNYVACGGLDNICSIYNLKTR 
EGNVRVSRELAGHTGYLS CCRFLDDNQI VTS SGDTTCALWD I ET 
GQQTTTFTGHTGDVMS LS LAPDTRL F VSGACDAS AKLWD VREGM 
CRQTFTGHESDINAICFFPNGNAFATGS DDATCRL FDLRADQEIi 
MTYSHDNI ICGITSVSFS KSGRLLLAG YDDFNCNVWDAL KADRA 
GVLAGHDNRVSCLG VTDDG KA V ATGS WDS FLKIWN 


5640 



280


1092 


QQGNKKTMLSHNTMMKQRKQQATAIMKSVHGNDVDGMDLGKIW'S 
IPRDIMLEELSHLSNRGARLFKMRQRRSDKYTFENFQYQSRAQI 
NHSIAMQNGKVDGSNLEGGSQQAPLTPPNTPDPRSPPNPDNIAP 
GYSGPLKEIPPEKFNTTAVPKYYQSPWBQAISNDPELLBALYPK 
LFKPEGKAELPDYRSFNRVATPFGGFEKASRMVKFKVPDFELLL 
LTD PR FMS FVNP LSGRR S FNRTPKGWI S ENI P IVITTBPTDDTT 
VPESEDL 


5641 


27 


332 


CKHNCNGDVKLLSNQMDKLFAFHLFTFHGLLHFLDGSIQKLIQA 

EIILSDNSSILVLENNFLFKVECSKQFIHLIAKKFYISITIVSAS 
NGESFVLSMIVTG 


5642 


199 


1247 



ITPCRMDFLVLFLFYLASVLMGLVLICVCSKTHSLKGLARGGAQ 
IFSCI rPECLQRAKHGLLHYLraTRNHTFIVLHLVLQGMVYTEY 
TWE VFGYCQEL ELS LH YLLLP YLLLG VNLFF FTLTCGTNPG I IT 
KANBLLFLHVYE FDEVMFPKWRCSTCDLRKPAF^KHCSVCNWC 
VHR FO H HCVWVNKTC I GAWN I R YFL I YVLTLTASAATVAX VS TT F 
LVHLWMS DL YQET YIDDLGHLHVMDTVFLIQYLFLTFPR I VFM 
LGFVWLS FLLGG YLLFVLY LAATNQTTNEWYRGDWAWCQR CPL 
VAWPPSAEPQVHRNIHSHGLRSNLQBIFLPAFPCHERKKQE 


5643 


1 


847 


PSGG VRDVETRGPGSRAARG PRWMERRGVGAGA IAKKKXAEAK 
YKERGTVLAEDQLAQMS KQLDMFKTNLEE FASKHKQE IRKNPEF 
RVQ FQDMCATIG VD PLASGKGFWSEMLG VGDFYYELGVQ HE VC 
LALKHRNGGLITLEELHQQVUCGRGKFAQDVSQDDLI R A I KKLK 
ALGTGFGI I PVGGT YLI QS VPAELNMDHT WLQLAE KNGYVTVS 
LuvH^ii^ftx i c»KMJ<y v baHixuiUSLuuAyf i^liyAPGEAHYWLPALF 
TDLYS QE ITAEEAREALP 


5644 


83 


1138


* i\iu'won »wuAio vovyyimfisfn Vrt\jyj*yi5KJVj<£".niEVlI£YFQ 
KfCVSPVHLKILLTSDEAWKRFVRVAELPREEADALYEALKNLTP 
YVAIEDKDMQQKEQQ FR EWFLK EFPQ I RWK IQES I BRLRVIANE 
IEKVHRGCVIANWSGSTGILSVIGVMLAPFTAGLSLSITAAGV 

uiivyxnOAi/iuirwoi v cJSi l IKoiULJj J^AoKLiTATSTDQLEALRD 

I LHDI T PNVLS FALD FDEATKM IANDVHTLRRSKATVGR PLIAW 
R YVP I NWE T LRTR GAPTR I VR XVARNLG KATS G VL WLD WNL 
VQDSLDimGEK^ESAELLRQWAQELEENLNELTHIHOSLKAG 


5645 


537 


799 


VQSVRDLKRLSPTDPPGDSGNRDVTREDPVTGPLNSASS^QVPTL" 
YLCLQNSLLGHSSVEDARATMELYQISQRIRARRGLPRLAVSD 


5646


3745 


3328 


AEQYGTS PHLLPTMLLS SCLPPANVXTKAATPPPLVLS LTTADP 
AGKPAPCRVTLTLLRAS I PATKRASFLS S FIKMFFEELEYTLGF 
LSLLKFHVHVS VYS A I CHFQKEGTGNSRS FTCTPELFP RLQTHL 
RAEGGAQ 


5647 


288 


800 


GVIMATSELSCEVSEEMCERREAFWAEWKDLTLSTRPSEGCSLH 
EEDTQRHETYHQQGQCQVLVQRS PWLMMRMGI LGRGLQE YQLP Y | 
UK VLPLP I FTPAKMGATKEEREDTP IQLQELIALBTALGGQCVD j 
_ RQE VAEI T KQLPP WP VSK PG ALRRS LS R5 M50E AORG \ 




5648


7


1518 


VLS3LCGRHEAI*REVGAEWPPPTCSPKICSGLQQAGNTDWSiTM 
APQ3 LPS S RMAPLGMLIX3LLMAACFTF CLS HQNLKE FALTN PEK 
SSTXETERKicrTCAPPPT.na , PTrr.i?\ftrtJt>n^reLT/\i»T tit, 1 

>^v» * -^i— * x c^n n*d J5iiii//\fl ViiJiVrnrJ rU£ WQALQPGQAVPAGS 1 

HVRLNIOTGEREAKLQYEDKFRNNLKGKRXiD intntytsqdlks 
AIiAKFKEGAEMBSSKEDKARQAEVKRLFRPIEELKKDFDELNVV 
IBTDMQIMVRLINKFNSSSSSLEEKIAAGFDLEyyVHQMDNAQD 
LLS FGGLQ WI NGLNS TE PLVKEYAAFVLG AAFSSNP KVQVEAI 
BGGALQKLL VI LATEQPLTAKKKVL FALCS LLRHFP YAQRQFLK 
LGGLQWiTLVQBKGTEVliAVRVVTI^YDLVTEKMFABEEAELT 
QEMSPEK1^QYRQVHLLPGLME(^WCEITVHLLALPEHDAREKV 

LQTLGVLLTTCRDRYRQDPQLGRTLASLQAEYQVLASLELQDGE 
DBGYFQEliLGSVNSLIiKELR 


5649 


1172 


3006 


KLQEQLDAllJEEIRMIQEEKESTELRAEEIETRVTSGS^EAl^iH 
KQLRKRGSI PTSLTDLSLASAS P PLS GRSTP KLTSRSAAQDLDR 
MGVMTL PSDLRKHRR KLLS P VS REENREDKAT I KCETSP P S S PR 
TLR LE KLGH PALS QE EG KS ALE DQGS NPSS S NSSGjDSLHKGAKR 
paj x tu> :> i FGKKE KGRL I QLSRDGATGH VLLTDSEFSMQE PM 
VP AKLGTQAEKDRRLKKKHQLLEDARR KEMP FAQWDGPTWSWL 
ELVrVGMPAWYVAACRANVKSGAlMSALSDTEIQREIGISNALHR 
LKLRLAIQEMVS LTSPSAPPTSRTS SGNVWVTHEEMETLETSTK 
TD S BEGS WAQTLAYGDMNHE MIGNE WLPSLGLPQYRS YFME CLV 
DAHMLDHLTKKDLRVHLKIWDSFHRTS^YGIMCLKRLNYDRKE 
LEKRREBSQHE I KDVLVWTNDQWHlr/VQS IGLRDYAGNLHESGV 
H3 ALLALD ENFDHNTLAL ILQ I PTQNTQARQVMERE FNNLLALG 
TOR KLDDGDD K VFRRAP S WRKRFR PREHHG RGG MLSAS AETL P A 
GFR VS TLGTLQP P PAP PKKIMP EAHS H YLYGHMLS AFRD | 


5650 
5651 


1172 


3006 


mlqeqldaineeirmiqeekjestelraeeietrvtsgsmeaTOiH 

KQLRKRGS I PTSLTDLSLASASPPLS GRSTP KLTSRSAAQDLDR 
MGVMTLPSDLRKHRRKLLSPVSREENREDKATIKCBTSPPSSPR 
TLRLEKIX3HPALSQESGKSALEDQGSNPSSSNSSQDSLHKGAKR 

j. tuj j, vjnjjr ixiv^c. jujkjli Ji si Jjo KlXiA TGH VXiLTD S E FSMQE PM 1 
V? AKLGTQ AE KDRRL KKKHQLLEDARR KGM P FAQWDGPTWSWL 
ELWGMPAWYVAACRANVKSGAIMSALSDTEIQRBIGISNALHR 
LKLRLAIQEMVSLTS PS AP PTSRTSSGNVWVTHE BMETLBTSTK 
TDSEEGS WAQTLA YGDMNHE W IGNEWLPS LGLPQ YRS YFMECLV 
DARMLDHLTKKDLRVHLKMVDS FHRTS LQ YG IMCLKRLN YDRKE 
LE KRREESQHEI KDVLVWTNDQWHWVQS IGLRDYAGNLHESGV 
HGALLALDENFDHKTLALILQI P TQNTQARQVMEREF3NNLLALG 
TDRKLDDGDD KVFRRAPS WRKRFR PREHHGRGGMLSASAETLPA 
GFRV3TLGTLQPPPAPPKKIMPEAHSHYLYGHMLSAFRD 


5652


646 


1869 


ARQGQRQPWG*EARAKGt'A6F<Pl?V*KrgrMTrrDagD»>nnr.omT — 

AWGEGAG I R* ASGLTAAGAAS AAAA/ P P PTRGG PAPAG CGRAP P 

WPAPLRVPTHGRAPAPRSRAAPRAPALSHGTAAAALSPASPAGP 
ADP*LPGHS SQS PPRG *RWGRS RSAPAPAHPPH PAParQa caen 

QTPG W PGS CCLAQ3WQAE PLGAPGAE DG \ P VP PQ RG FPLGTLGS 
PAGS WAGLAG YG* AGAPGTQATAPRAAGQT P VAAAPNCR V*GS A 
PALHRAPAAADPGSPLQAPPRAWAS PAAAG PGLSSSDYCGGLGA 
GWRAGISPBLLGAAGLSDNWARCPGPGPAB^GGQPGCRTIPASA 
CMPSPPVEGSLGLSRKGHGDLPSQAR*GWHECRRARHLVPLPRL 
LGPRGRTGRPSSPS | 


5653


735 


343


HHKKYQHIHQKS FSCPEPACGKS FNFKKHLKEHMKLHSDTRDYI 
CE PCARS FRTS SNLVIHRR I HTGEKPLQCE I CGFTCRQKAS LNW 
HQRKHAETVAALRFPCEFCG KRFEKPD S VAAH RSKSHPALLLA | 




66 


1401 



RGRLQSRGRLTLGLVLLLLD I LGARQHGQRVS HGWKGG FLTAPL 
^PQPCQPGTRRGRRRSLKEATBPQLAMAEEFVTLKDVGMDFTL 
3DWEQLGLEQGDTFWDTALDNCQDLFLLDPPRPNLTSHPDGSED 
[iBPLAGGSPEATSPDVTETKNSPLMEDFFEEGFSQBI /SRDVIQ 
3WLLBLQFRRSLYRGHLVR * FARRS RKS SEV * YCHQRGKS HGMQ 
ES * J-KKRTysCVHRFHGRRFHG \DNVSEKTIiTPAKSKEyRGEFP" ~ 
S YSDHS QQDS VQEGEKP YQCS B CGKS FSGS YRLTQHW I THTRE K 
PTVHQECEQGPDRKASHSGYPKTHTGYKPYVCNEYGTPPSQSTY 
LWHQKTHAGEKPCKSQDSDHPPSHDTQSGEHQKTHTDSKSYNCN 

ECGKAFTRIFHLTRHQKIHTRKRYECSKCQATFNLRKHLIQHQK 
THAANV 


5654


3 


598 


TLPLFPGRRrRGWRRCGAVAARKNSTGGNVSINQRRDSVRMSAL 
NWKP FVYGGLAS ITAE CGTFP I DLTKTR PQ I QGQTNDAKFKEI I 
YRGMLHALVRIGREEGLKALYSG*VGLHAPLCHCSLPHMGIDFR 
PRLHRSQVKSLRCV*KEQIA**/MFSLLISTLISKYIYYAADVL 
EKLFYYIQVQTDNNKKICLPKNI 


5655 


2 


867 


RPPG I RAPRQLHPAAGRRPDASARPRFR PTVLLHDPFQ I»S FPP P 
PLSYPSVFPAVARVLPQRSGDYRAAGMPQLSGGGGGGGGDPELC 
ATDEM I PFKDEGDPQ \RE KI FAE I VJJPEEEGDLADI KS S LVNES 
EI I PASNGHE VARQAQTS Q E PYHDKAREH PDDGKHPDGGLYNKG 
PS YS S YSG Y IMMPNMNND P YMSNGSLS P P I PRTSNKVP WQP S H 
AVHPLTPLITYSDEHFSPGSHPSKIPSDVNSKQGMSRHPPAPDI 
PTFYPLSPGGGGQITPPLGWQGQP 


5656 


228 


1066 


PRRVP PliPE FASGPGAAFFHSGRLQRS LiTKDSAGCFSQCRS RAM "" 

LVLRSOIiTKALASRTLAPQVCSSFATGPROYDGTFYEFRTYYLK 

PSNMNAFMENLBCKNIHLRTSYSELVGFWSVEFGGRTNKVFHIWK 

YDNFPKRAEVRKALANCKEWQEQSIIPNIARIDKQETE1TYLIP 

WSKLQKPPKEGVYELAVFQMKPGGPALWGDAFERAINAHVNLGY 

TKWGVPHTEYGELNRVHVLWWNESADSRAAVRHXSHEDPISHG 

GVRESVNYL\VSQQNM 


5657 


105 


1052 


GQRLQSPRVQMPVQPPSKDTEEMEAEGDSAABMNGEEEBSEEER " 
SGSQTESEEESSEMDDEDYERRRSECVSEMLDIiEKQFSELKEKL 
FRERLS QLRIiRLEE VGAERAPEYTEPLGGLQRSIiK I RIQVAG I Y 
KGFCLDVIRNKYECELQGAKQHLESE KLI1I1YDTLQGELQER I QR 
LEEDRQSLDLSSEWWDDKLHARGSSRSWDSLPPSKRKKAPLVSG 
PYIVYMLQEIDILEDWTAIKKARAAVSPQKRKSD\DLDPAVHSQ 
GDPQ SS WHCTQDSRLPPADRRTHR PLRVCPARLLWCCWALPLH L 
ALVWTPPL 


5658


2346 


3541 


TERRVYNPWPEPDPD\CIQEDPWNLPNSIKTLVDNIQRYVEDGK 
NQLLLALLK<nX»TELQLRRDAIFCQALVAAVCTFSEQLIAALGY 
R YNNNGE YBESS RDASRKWLEQVAATG VLLHCQ SLLS PAT VKE E 
RTMLED I WVTLSELDNVTFS FKQLDEN Y VANTNVFYH IEGS RQA 
LKVIFYLDSYHFSKLPSRLEGGASLRLHTALFTKVLE.WEGLPS 
PGSOAAEDLQQD 1 NAQSLE KVQQY YRKLRAFYL ERSNLPTDAST 
TAVKIDQLlRPINAIiDEIiCRLMKSFVHPKPGAAGSVGAGLIPIS 
SELCYRLGACQMVMCGTGMQRSTIiSVSLEQAAILARSHGLLPKC 
I MQATD I MRKQG PRVEILAKNLRVKDQMPQGAPRLYRLCQPKMN 
GDL 


5659 


2 


696 


WKRSGBVSPKGELGAWRGNSGRPKiiGRAAJ^NEDRTt^RLLP 
GNERS QPRS PLRLIAPQLKAEAAADKGIAP VPPPFS SGHSGPC \ 
EREGEGQRGRGRSRRGAHLELXPS PGLRAGAPTDRGRGG PAE VA 
AAGGRRMVQKESQATLEERESELSSNPAASAGASLEPPAAPAPG 
EDNPAGAGG\ AAVAG AAGGARRFLCG WEG PYGRP WVM3QRKEL 
FRRLQKWEIiNTYI* 


5660


229 


853 


PVTMWAFS ELPMPLLINLIVSLLGFVATVTLIPAFRGH FIAARL 
CGQDLNKTSRQO IP E SOGVI SGAVFLT I O t?d v t urmnn? 
QRKAFPHHEFVALIGALLAICCMI FLG FADDVLNLRWRHKLLLP 
TAASLPLLMWFTNFGNTTIVVPKPFRPILGLHLDLGR*SYHCC 
PYGTYFREPFLVLHILLQVFLFCLCVFPDPFW 


5661 


2 


473 


LWLYPSPCGGIPKLPGLPREAAAALGASFLAEAPLPVTVRGSGL 
AGMAVTCDPKAFLSICFVTLVFLQLPLASICQN*GTDSCASRGK 
ADFDVTG PHAP ILAMAGGHVE LQCQL FPNI SAEDMBLRWYRCQP 
SLAVHMHERGMDMDGEQKWQYRGRT 


5662 


2 


1318 


LRKEGRCRRGSNRGVWAAPAEGLGGRGMLGVRCLLRS VRFCSSA • 
PFPKHKPSAKLSVRDALGAQNASGERIKIQGWIRSVRSOKEVLP 
LHVNDGSSLESLQWADSGLDSRELTFGSSVEVQGQLIKSPSKR 
QNVELKAEKIKVIGNCDAKDPP I KYKERHPLEYLRQYPHFRCRT 
NVLGS I LRIRSEATAA1HSFFKDSGFVH IHTP I ITSNDSEGAGE 
LPQLEPSGKLKVPEENPFNVPAPLTVSGQLHIjEVMSGAFTQVFT 
FGPTFRAENSQSRRHLAEFYMIEAEISFVDSLQDLMQVIEELFK 
ATTMrfVLSKCPEDVELCHKFIAPGQKDRL*HMLKNNPLIISYTE 
AVE ILKQASQNFTFTPEWGADLRTEHEKYLVKHCGNI PVFVTNY 
PLTLKPFYMRBNEDGPQELEGSVA*HSLGLMILLSIWIGQP 


5663 


119 


698 


PADIGRSTAKTPGPPRSLEMDDPRYGMCPLkGASG'CJPGAteRSLL 
VQSYFEKGPLTFRDVAIEPSLEEWQCLDSAQQGLYRKVMLENYR 
NLVFLG IALTKPDLITCLEQGKEP WNXKRHEMVAKPPVI CSIIFP 
QOLVJAEQDI KDS FQEAI LKKYGKYGHANPQLQKG CKS VD E CKVH 
KEHDNKLNQCLIPKKKK 




5664


118


572 


SLSMESNHKSGDGLSGTQKSAALRALVQRTGYSLVQENGQRKYG 
GPP PGWDAAP PERGCE I Fl G KIiP R DLFEDELI PLCEKIGKI YEM 
RMMMD FNGNNRG YAFVTFSNKVEAKNAI KQLNNYE I RNGRLLGV 
CAS VDNCRLFVGG I PKTTKK 


5665 


347 


702 


WQHLIILLHCERTSPAMITSELPVLQDSTNETTAHSDAGSELE 
ETBVKGKRKRGRPGRPPSTNKKPRKSPGBKSRIEAGIRGAGRGR 
ANGHPQQNGEGEPVTLFEWKLGKSAMQRC 


5666 


213 


540 


VSCLPTSCKMI TliNNQDQPVPFNSSHPDE YK1 AALVFYSCI Fit 
GLFVNITALWVFSCTTKKRTTVTIYMMNVALVDLIPIMTLPFRM 
FY YAKDEWPFGEYFCQI LGA 


5667 


1 


695 


HPLPSASLGLPSVSLGVSLCVRSALLEAWPHLPKRRRARVGSP - 
SGDAASSXPPSTRFPGVAlYIiVEPRMGRSRRAFLTGLARSKGFR 
VLDACSSEATHVVMEETSAEEAVSWQBRRMAAAPPGCTPPALLD 
IS WLTES LGAGQPVPVE CRHRLE VAGP S KGPLS PAWMPAYACQR 
PTPLTHHNTGLS EALBI LAEAAGFBGSEGRLLTFCRAAS VLKAL 
PSPVTTLSQLQ 


5668


691 


894 


csflfcipdlflqfllgrkekeavlvggewspsLdgLdpqadpq " 
vlvrtai rcaqaqtg i dlsg ctkw 




5669


407


1 


DSGAPEGLSPLM3TQEGLSMHAHPQAYTPFIYliHARKRRGEIGD ~ 
ADSRFNDR YAHKSAQL YFL Y FVCW I FQDVY Y PTI KEKNHFFFPK 

ARGAPTKYSGSPIGS PTTTPPTRPPS FNLHPAPHLLASMQLQKL 
NSQ 


5670 


J 


373 


SSECLTMAW I PLLLPLLI LCTVSVASYELAQPSS VSV^PGCJTAK 
I TCSGDVLAKKYARWFQQ KPGQAPVLVI YKDTBRPSGI PERFSG 
S TSGTTVTLT ISGAQVEDBAD YFCYS ATDNFLWVF 


5671 


280 


524 


KFPPKKTPPHLGMESAITLWQFliLQLLLDQKHBHLICWTSNDGE " 
FKLLKAKKVAiO^WGLRF^TimNmKLSRALRI.LFOT 


5672 


2 


557 


fvpatpdpgvwlppsrdpamakrssi.yirivegknlpakditgS 
s dpyci vkvdne p 1 1 rtatvwktlcp fwgeeyqvhlp p tfhava 

FYVMDEDALSRDDVIGKVCLTRDTIASHPKGKFSLPSHTGLPSP 
NPPSHSETSPLGSVWSPAQGKPFLLSPEAGATFCTPGLCSAACS 
OAWLLLPLP 


5673


327 


696 


ITVADQ I SHWSAGR I KNRTRI PECIHSSAATTIiAGPHTMEGESV " 
KLSSQTL I QAGDDEKNQRTITVNPAHMGKAFKVMNELRSKQLLC 
DVMIVAEDVBIEAHRVVLAACSPYFCAMFTGDMS 


5674


17 


984 


GGGSMBGESTSAVLSGFVLGALAFQHLNTDSDTEGFLLGSVKGE " 
AKNSITDSQMDDVE WYTIDI Q KYI PCYQL PS FYNS SGEVNEQA 
LKKILSNVKKNVVGWYKFRRHSDOIMTFRRRr.T.H IOJT .OFW 7?<s wo 
DLVPIjLLTPS I ITES CSTHRLEHSLYKPQKGLFHRVPLWANLG 
MSEQLGYKTVSGSCMSTCFSRAVQTHSSKFFEEDGSLKBVHKIN 
EM YASLQEEIjKS I CKKVEDSEQAVDKLVKDVNRLKRE I EKRRGA 
QIQAARF^IQKDPQFJ^FLCC^LRTFFPNSEFLHSCVMSLKID 
MFLKVAVTTTTISM 


5675


80 


753 


EGSRRGPTR LARLSARAGRLHFP PGFS SRLIHFRGVSECRRPPG" 
KSGVPVSAPGSDGKWWBERPGMFSLMAS CCGWF KRWREP VRKVT 
IiLMVGIJ?NAGKTATAKG IQGE YPEDVAPTVGFS KItihRQGKFEV 
TIFDim5IRIRGIWKNYYABSYGVIFVV13SSDEERMEETKBAM 
semlrhprisgkfilvu«ikqdkegai*geadvieclsle!3a^e^ 

HKCL | 


5676 


2 


930 


fvssppprpvqparpggfglsgrrsulcovastpahvgvmrspv 
rdlarndgeestdrtpllpgapraeaapvccsarynlailapfg 
ffivyalrvnlsvalvdmvdsnttlednrtsxacpehsapikvh 
hnqtg kkyqvtd aetqgw ilgsf fygyi itq i ?gg y vas kiggkm 

llgfgilgtavltlftpiaadlgvgplivlraleglgegvtfpa 

MHAMWSS WAP PLERS KLLS I S YAGAQLGTVI SLPLSGI I C YYMN 

WTYWYFFGTIGIFWFLLWIWX^SDTPQXHKRISHYEKEYILSS 
L 


5677 


1 


1028 


PPRDGFljELRRLSVPLCSGPCPLTSLSRQGERSGGHIiVAAARAA 
VTAETHPLPLLAPLAVCQS VKS PAACCfVRPR PRAVALPAALGG P 
GRSLPGLTAATMSSFSESALEKKLSELSNSQQSVQTLSLNLIHH 
RKHAGPIVSVWHRELRKAKSNRKLTFLYLANDVIQNSKPJCGPBF 
TREFESVL\T)AFSHVAREADEGCKKPLERLIJTIWQERSVYGGBF 
I QQL XLSM ED S KS F P P KATEEKXSLKRTFQQ I QEEEDDDYPGS Y • 
S PQDPSAG PLLTEEL I KALQDLENAASGDATVRQKI AS LPQEVQ 
DVS LZjEKI TDKEAAERLS KTVDEACLRNRG PGTS 


5678 


3 


593 


SSSPPSSTPSLPLPFYbliLGQLRLQLLWGTAHLSGAGEAAPCPG - 
GSGRTAAPRTRADPAAQSLMlPUfKMKNFKRRFSLSVPRTETIEE 
SLAE FTEQFNQLHNRRNENLQLGPLGRD PPQECSTFS PTDSGEE 
PGQLS PGVQFQRRO>IQRRFSMEVRAS GALPRQ VAGCTHKGVHRR 
AAALQPDFDVSKRLSLPMD I 


5679 


2 


423 


LNS RVDDFVAVPGAIMDEDYYGSAAE WGDEADGGQQEDDSGEGE 
DDAEVQQECLHKFSTRDYIMEPS I FNTLKRYFQAGGS PENVIQL 
LSENYTAVAQTVNLLAE WL IQTGVEPVQVQETVENHLKSLLI KH 
FDPRKADSIFTEEGETPAWLEQMIAHTTWRDLFYECLAEAHPDCL 
MLNFTVK^GRVLELRRKVTMNVYFWLLVCFL 


5680 


258 


592 


RRLTST3EKLQNRNSHTPLESLIHPQPSYKGFGIMFGKKKKKIE 
ISGPSNFEHRVHTGFDPQEQKFTGLPQQWHSLIiADTANRPKPMV 
DPSCITPIQLAPMKTIVRGNKPC 


5681 


45 


869 


LLCAKTLGVRTKESQAEG YNRSGINNHQAEDPRFCPSFCWMRSA ' " 
RQTRPQRLRKEAARPPTPGSCPGGTGMDGKKCSVWKFLPLVFTL 
FTSAGLWIVYFIAVEDDKILPLNSAERKPGVKHAPYISIAGDDP 
PAS CVFS QVMNMAAFLALVVAVI.K KIQLKP KVLNPWLNI SGLVA 
LCLAS FGJTTLLGNFQLTNDEEIHNVGTSLT FGFGTLTC W I QAAL 
TLKVN I KNEGRR VGI PRVI LS AS I TLCVGPLLHPHGPKHPHVCS 
QSPVGPGHVL 


5682


39 


622 


PSRSCLGIMRKWRHREVNLPEVTQQDAVCPAPIPSPGLSAQTGL"" 
QKIWGTIHCQVCPGAPAWPGSPWHEEWGI*LLLVPLLLLPGSYGL 
PFYNGFYYSNSAinXlNIjGNGHG^LIiNGVKLVVETPEETLFTYQ 
GAS V I LPCRYRYEPALVS PRRVR VKW WKLSENGAPEKDVLVAI G 
LRHRS FGDYQGRVHLRQD 


5683 


89 


778 


GSCGATALI TRCLAWS VL IS RLAMATYTCITCRVAFRDADMQRA 
HYKTDWHRYNLRRKVASMAPVTAEGFQERVRAQRAVAEEESKGS 
ATYCTVCSKKFASFNAYENHLKSRRHVELEKKAVQAVNRKVEKM 
KEKNLEKG LGVDS VD KD AMNAAI QQAI KAQPSMS PKKAPPAPAK 
EARNVVAVGTGGRGTHDRDPSEKPPRLQWFEQQAKKLAKHSEDD 
SEDEEHDLC 


5684 




677 


TWCFRGYLGPRVIMXALDEPPYLTVdVbVSAJCYRGAFCEAKIKT " 
J\x\KL. vjs.vi\V 1 r KliDbi> 1 V is VQODHI KG P L KVGA I VE VKNLDGAY 
Q3AVINKLTDASWYTWFDDGDEKTLRRS3LCLKGERHFABSET 
liDQLPLTNPEHFGTPVIGKKTNRGRRYE 


5685 


779 


1262 


LIiLQQPVVHCFLLFPPFRFSHHMIPGPPGPHTTGIPHPAlVTPQ 
VKQEHPHTDSDLMHVKPQHEQRKEQEPKRPHIKKPLNAFMLYMK 
EMRANWAECTLKES AAINQ I LGRRWHALS REEQAKYYELARKE 
RQLHMQLYPGWSARDNYVSPSS IP7ALHS 


5686 


128 


1181 


CTWWQVNITIiLD INDNHPTWKDAPYYINLVEOTPPDSDVTTVVA 
VDPDLGENGTLVYSI QPPNKFYS W3STTGKIRTTHAMLDRENPD 
PHEAELMRKIWSVTDCGRPPLKATSSATVFVNriLDIiNDNDPTF 
^Lpfvaevi^gipagvsiyqwaidldegijjglvsyrmpVgmp 

RMDFLINSSSGVWTTTELDRBRIAEYQLRWASDAGTPTKSST 
STLTIHVLDVNDETPTPPPAVYNVSVSEDVPR\GSGWSG*AARN 
ITOVGLNAELSYFITGGNVDGKPSVGYRDAVVRTVVGLDRETTAA 
YMLILBAIDNGPVGKRHTX3TATVFVTVLDVNDKRPIIL0SSYV 


5687 


17 


917 


AAPPAPPDG/PPP/PPPAPPT/PGPAA/APASSCQPRLSAGRAA" 
QGDGGAAAVGHVLW PAVG PVRVNPGLQTP VPRPELLPG P\SSS 
LHSDSSYPPDAGLSDDEBPPDASLPPDPPPLTVp/ADA/PMPVT 
SGCRMPSTSASE/ AAGGQGACTHAKGS ETPPPASPQTSEPAPSP 
LPPHLTGGPGMYSSEAKLPNSFSCLGLAGTGAGI*GTASAHGTG 
PPVLPHVCTPSliANPQP\AVGPEASSLPLGVSGIGMSA/SAPIS 
SSPFVAIGSCWLRGIPPPGSGFLCPGRAPGPVPITTHGQEGQGP 
VLDI 


5688


1 


420 


LTKWDL FG& (J Y KLLKTG I EHGAM P EQVG VYW YS / C L YDSRKLFF 
*SHMIIRSLL*KVTDDSLGQLPLLRELLL**LWIDRCIILAYV 
LRVEKTFAI TYL KNFTVKVDFSLLGE I PLISMAAI LKLWIMKID 
DGYIPAVF 


5689 


1504 


3 


HELSG KH I SM VS GNTCN WH PGGHS PGGGGQGE ITS KDRGB I PAL 
IWA/RK?IGTWTATECPTHRAG*GGABEYQPPFQPCEGPRSTSRG 
GEG*GHAVGPGREIGKBGSLPFLGPKALGF*SASCQRAFEGGAH 
GSTARKPAPATPGTRHPRTMETREVAQGWPAGPRSQFWDQHPHS 
PGEHRPSG\SPLPACPPRAWPKAGAVASATGTG\PQLPGSRGKQ 
KLPRTRE PPLLOAGWAVRKP PWSEAKEGLGQAGR PSGMDS S AS \ 
PQTPGGRGSLEWGLPL¥LGPHHDVK*RSDRLG* PP* ggqggggh 
GAPSTPGPGGEAW*OPQQTSRPKPGPQAY*GE\GSPGLQCPCSK 
EL*RVPPGSLGPSTQCKYEPTDKHS\GGADAQLEVSTAGSRSTF 
GQELKGPliDAGRl>WPGAPSASSSHR*GG*ERARAGAGHRGST*A 
SSK1 EQGRPRPGPTSDALADVEGGAES /GPHPWPLPGTLPNR/ P 
GSPPPA*ASAGRKGTVSTLGGGLL 


5690 


1424 


58 


PSPPAGVCAAPAPLPIJLAlJU^RRPCSPGAEAAPMQTGGPAID 
GAWRTS VS ALRRG ATG/ APCS PGAEAAP WQTGG PAI DG\DGELP 
^VRSEEAPRGOGAEGGGPGSGPVRRPGAGRGAHAGOGRQQDPEP 
DGLRHRQHGAASHARHRLQRLRPGHHQNRHVRRDPQAPPGGPAP 
GHAAALPERTRGVAEPPAWAHAGSDAWRAGR * SQRT * ERAR PRH 
PTFQGRAGS\GQPGYQPPNPHPGPSSPPAAP\GPRGA*GNPQLE 
KAPRSDRNPSQGLRTRIRRPETPDCGPPSPAGSSASASTFRCTS 
SLSLLGP/PGAHNLDTAPQDR*HGP*GDKRGAPGVAGEDPRPP* 
GNFVR* LLLMP/ GVA* RHGTS PFLGPSLGBNGGQWDSGNLFGTP 
KG * SHPAFTRST * SMEAEKS YWNHPHR\ DRGRQG VRINCLRVGE 
SBMWGP YSAPRPGTVFIiSSFLS PASEEH\ PEGSSSFNTPFPPAG 
PEGDPGLNS PGLLP 


5691 


107 


550 


ISNDPSPGYNIEQMAKRGKKLVEliPYTVKGMDVSFSG"LSFlED * 
VAHRMJLATGHCTPEDLCFSLQVMQ*KTGTESWG*RFYIVEQN*S 
GDAPLIFSPYLSLTGNCGFAMLVEITERAMAH\CGSPGGPSLWG 
GVGVYVLLESVPLSYS 


5692


1193 


543 


TQAWTRAEKDRKGSVRAtiRLHLERGPPT*RGSHPL\QSVPCIQK 
PSIFSSYPI/GLPQSGGEPGPVGEQQPVRRPEQPSCGPASRMPL 
TSRSVPPGRGAtiPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQ 
RLNLPVMGATRSNLQPPRKVAV PG PT R * RDQDSKQDFS S KPLQS 
VPGLASTQQTLTPADSGPGTGGRDATRAGLPGVETMGNGVD 


5693 


1258 


1*30 


ALT WP VRKGTTWWAOPlIGCSNT , V QRftPT.mQQOD camtp -imn — ' 
*QAGPPSSLRPP\SRRR*APEWPKRATGSRCRGLSAPPWPWPAA 
RGE/PGSAPSHAP/PNSPRPSGTRHP/PGPSSRVLYSPSLPRNS 
PEArVWRSSRFPLWFPLRCCFWVSGFKDPNPVLRFF 


5694 


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLkdJ*WT " 
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
SLAESGLSWFSESEBKAPKKLEYDSGSLKMEPGTSKWRRERPES 
CDDSS KGGELKKPIS LGHPGSLKKGKTPPVAVTSPITHTAQSAL 
KVAGKPEGKATDKGKLAVKNTGLQRSS SDAGRDRL9DAKXPPSG 
IARPS TSGS FG YKKP PPATGTATVMQTGG S ATLS KI QKSSG I P V 
KP VNGRKTS LDVSNSAEPG FLAPGARSN IQ Y RS LPRPAKSS SMS ' 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAP VNQTDREKEKAKAXAVALDSDNI S LKS IGSPESTPKNQASH 
PTATKLABL PPTPLRATAKS FVKP PSLANLDKVNSNS LDLP S S S 
DTTQCI 


5695 


3 


1336 


GS KE PARS LHRRGSGH KS SAGKWGS VTLSTAGALG * KQLHQ * WT 
QRCIi\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVI*RTDSEKR 
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 
CDDSSKGGELKKP IS LGHPGSLKKG KTPPVAVTS P I THTAQSAL 
KVAGKPEGKATDKGKLAVKMTGLQRSSSDAGRDRLSDAKKPPSG 
I ARPSTSGSFG YKKP PPATGTATVMQTGG SATL3 KIQKS SG IP V 
KPVNGRXTSLDVSKSAEPGFLAPGARSNIQYRSLPRPAKSSSMS 
VTGGRGGPRP V3 SS I DPSLLSTKQGGLT PSRLKE PTKVASGRTT 
PAP VNQTDRE KEKAKAKAVALDS DN ISLKS IGS PESTPKNQASH 
PTATKLAELPPTPLRATAKSFVKP PSLANLDKVNSNS LDLPSS S 
DTTQCI 


5696 


3 


1338


GSKE PARSLHR RGSGHKS SAGKWGS VTLSTAGALG *KQLHQ* WT 
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSBKR 
SLABSGLSWFSESBEKAPKKLEYDSGSIiKMEPGTSKWRRERPES 
CDDSSKGGBLKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL 
KVAG XPEGKATDKGKLAVKNTGLQRS S SDAGRD RLSDAKKP P SG 
IARPSTSGS FGYKKPPPATGTATVWQTGGS ATLSKIQKSSGI PV 
KPVNGRKTS LDVSNSAEPG FLA PGARSNIQYRSLPRPAKSSSMS 
VTGGRGGP R P VS S S ID PS LLSTKQGGLTPS RLKE PTKVASGRTT 
PAP VNQTDREKE KAKAKAVALDSDNI S LKS IGS PESTPKNQASH 

PTATKIJ^LPPTPLRATAKSFVKPPSIANLDKVNSNSLDLPSSS 
DTTQCI 


5697 


1147 


47 


PSEALSPPACPSAPAPRRSIISRLFGTSPATEAAPPPPEPVPAA^ 
QG P AT VQS VBD FVPDDRLDRS FLEDTTP ARDEKKVGAKAAQQDS 
DSDGEALGGNPMVAGFQDDVDLEDQPRGSPPLPAGPVPSQDITL 
SSEEEAEVAAPTKGPAPAPQQCSEPETKWSS I PAS KPRRGTAPT 
RTAAPPWPGGVSVRTGPEKRSSTRPPAEMEPGKGEQASSSESDP 
EGPIAAQMLS FVMDDPDFES EGSDTQRRADDFPVRDDPSDVTDE 
DEGPAEPPPPPKLPLPAFRLKNDSDLFGLGLEBAGPKESSEEGK 
EGKTPSKENKKKKKKGKEEEEKAAKKKSKHKKSKDKEEGKEERR 
RRQQRPPRSRBRTAA 


5698


2


666


GAEAAEPQEDLPPLSQSSRFFQEQQKMNXSLGPVSFtoVAVDFT - 
QEE WQQLD PEQ KI T YRD VMLENYSNLVS VGYH I IKPDVISKLEQ 
GEEPWIVEGEFLLQSYPDEVWQTDDLIERIQEEEinCPgRQTVFI 
ETLI*R/ERGNVPGNTFDVETNPVPSRKIAYTHSLCNSCER\GF 
NAS SEYI S SDGRYARMKADECSG CGKSLLHI KLEKTHPGDQAYE 
FNQ 


5699


2


1448 


RVRQPPGLWVRRTVPAMQCPAGLSR VPGVAG /DPSLPSFRGPRD 
EAAHRGTIQTARHTRKLYVQGPASGPPLPR VSTQVA I *DEKPLA 
RPS /GRTNAPFPQGQKPAGKAAPGPAAAGR VAMR \ PGHPGLLAS 
DSQRSSSKGSGWETPVPWS*AQPGWVSGLLLLGDPSGPGSL*RS 
TWLVGGARGPEGSGVRGSGWPSGCSDIGWALAGWNHS*HLDPNT 
WTQKWTGE/SPAPGEBG\VAPAPRGPTAEHGHCBI»TTESQYSNN 
VP ILFQNPSGALRSRRTEPAG WVPPTRHK * DDG * TAAPASGG AP 

VSTPTWAGTP/IjNASIjGPTDPQGKPGCRPPCALPKPAGPERSA* 
GGSLG CR /S MLPAS SGPPPJi PGPR s t »a ar; a wtc a c n t> r»n or t» ?! * 

GWQPRRPGFAGRAALPGPPHPPSS*RELGGLPGPGW*TLDPLPA 

HPAHPPGSAPPWGALGGWAAARASLPWSPSLCLSFPAVTPVAGL 
FPPGRG 


5700 


923 




NGHKGVWE INI Y *RRSNIHKNS KSES HLNQDHS FPP PTPNS ARS 

KLHSTGTAKNTGLPLSGAPRQRAVFSGRTICQEFSSCLQCAYLD 
E*CSIASSLIKAILRVSVLSE 


5701 


59 


410 


IFEKICSDTQEFISPEINPQICiSWLiFDKdAK/NHATGKDSLFN 
KWS WKNWLSTCR*MRPG ? YFT P YTKINSK* I K/DANIRCE T VKL 
LEENTGENLHDTGLGNVFLDMTP1CTQPTKQK 


5702 


3


1517 


ETFVDPSQCGGIPSDSPHPVITPSRASBSSASSDGPHPVITPSR 
ASBSSASSDGPHPVITPSRASESSASSDGLHPVTTPSRASESSA 
SSDGPHPVITPSRASESSASSDGPHPVITPSRASESSASSDGLH 
PVITPSRASESSASSDGPHPV1TPSWSPGSDVT1*LAEALVTVTN 
IBVrNCSITEIETTTSSIPGASDTDLIPTEGVKASSTSDPPALP 
DSTEAKPHITEVTASABTLSTAGTTESAAJPHATVGTPLPTNSAT 
ERE VTAPGATTLSGALVT VS RNP LEETSALS VETPS YVKVSGAA 
PVS IEAGSAVGKTTSFAGSSASS YSPSEAALKNFTPSETLTMDI 
TTKGPFPTSRDPLPSVPPTTTWSSRGTNSTLAKITTSAKTTMKP 
PTATPTTARTR PTT\ A* VQVKME VS S SCG"* VWL PR KTS LT PEWQ 
KG * CSSSTGNSTPTRLTSRSPYCVSGEANG/ PSAAARHVP YAKR 
GCCP*PGPPPTDCSCVTVI>RGTQKVPMKGSMSKPLTPDVATGPS 
LTSTGVYWGGASPVPRGVLGLTliAHVLCFQKEKT


5703


14 


1117 


HHKDSRSQGLPRTQECARPBLRPLLCPRALWPVTRIiS YRCPWQA " " 
PKAGIGTKAKPSESHLKLHPGWPSLDRQGBPATLGTGTGHCSDS 
R ILRWHP * HTAAR* PRWRRLPSSHRWTRHLGVLRVQDKS * * VSL 
DPSCRPRFLRTC**YGMRSVASSSNPPPGWSGPGASVFPARPVS 
ALPTGPRCW*APRGRTRQPCGWPRLSSPHATADWGPGCPLSPSR 
GS WETAPGS * WCPWL*AARWTGWRTASGAS AGLGRAADRPSAWA 
RRVAGLLPGQGLTVRR*H* TAGAPASVRS SQGATRS PAPGGDQ C 

ACGRGPGSC*HPPPWPVSPSSPVPCPSGR*HLRGPLLSAARPRA 
AGWPRHSPHDTQTPEP 


5704


23 


562 


GDYBFDSPYWDDISQAAKDLVTRLMEVEODQRITAEEAISHEWI 
SGNAAS DKN I KDG VCAQ I E KNFARA KWKXAVRVTTIjM KRLRAP K 
OSS TAAAQS ASATDTATPGAAGGATAAAASGATSAPEGDAARAA 
KS DNVAPRRP* LPPQPQME VPPQPLMAVSPQPPMEASLQPLMGE 
SPQP 


5705


23 


562 


GD YE FDSPYWDDI SQAAKDLVTRLMEVEQDQRITAEEAI SHEWI 
SGNAAS DKN I KDG VCAQ I EKNFARAKWKKAVRVTTLMKRLRAPE 
QS S TAAAQSAS ATDTATPGAAGGATAAAAS GATS APEGDAARAA 
KS DNVAPRRP * LP PQPQMEVP PQPLMAVSPQ PPMEASLQ PLMGE 
SPQP 




1 1 ci 


610 


QLGRFXAQDU'VAIRKVKEVFGTGANRHWI LFTHKED*GGQALD 
DYVANTDNCS LKDIiVRECERRYCAFMNWGSVBEQRQQQAELLiAV 
IERLGREREGSFHSNDLFLDAQLLQRTGAGACQBDYRQYQAKVE 
WQVEKHKQELRENESNWAYKALLRVKHLMLLHYE I FVFLLLCS I 
LFFIIFLF 


5707 


23 


609 


GSPAPTPGPRRRPGRGTPSPGTRHHQGRAEPEPDAPERAPLRk* 
MFAI QPGIAEGGQFLGDP PPGLCQ PELQPDSNSNFMASAKDANE 
NWHGMPGRVBPILRRSSSESPSDNQAFQAPGSPEEGVRSPPEGA 
S I PGAE PEKMGG AGTVCS PLEDNGYASSSLS1DSRS SSPBPACG 
TPRG P GPPDPLL PS VAQA 


5708 


44 


1925 


S FS WE ETI 5 PC FPKMPAE P W WI»S P VSLGAAGWPGQPR P YLDL PA 
QAS VS RPHDRA* GEAVS LS LSSGD VCGHTDGGGAGSD PQAKP KP 
PRCPFTAMPSPRTKQKVRNKVCLLIAIRYSDIPSDVSKAP \GPA 
GNPHDRSSTAA* LHRRAGAGSLCLSASLLPPSFSLGAPGAPSPL 
RVS PASGGPRKEGRQGSGG * AGGGGP \ ARTKADLPCVGFVCS PP 
LLK*SDS PVKQLPA\SGQGSGAGMPPVGSSDILR PRPTSVSGTG 
RAAG * CS WQPAACCTPRS Q * WAVARS PSRCSRW* RQSGR*RG* S 
S RRRRGP * AAGRSTPAVP * P CS *GGAGRRAYACRTGWGYAPSR* 
LEPSGPTSGSAL* TWASHSTGA+ *SRLCGTAGTGPLCSQSSRS * 
AG*RCCCtAASPCGGSGPSHPGSPSAHCLSWSGGRTQPRAPSAH 
G3GRAMGSRCVCTCTGL PCPG I PLSGASPGGSGBTGAGRSHTLK 
AARS RLS PRPG SGS RGS Y* SHNDNWGTWPAPPSAGHLLVGG * NS 
QRTSSDH*YTGTRRPWAGPGTRCSTAPSRAAPPVSRCRPPPPPP 
PPRPPRLPAAAS/SGGASGSPAASCSCSCRAPAKPASS/GBAPA 
PPPRPEPPPPPARRP 


5709 


2 


2031 


ITLCPLPO/TEKCliNVVTEAATPLGIYLKARVEAGGLKELEISWG 
LHQIWRWGAWMRAGMGG CRCWG VMAPFAPR/NALS FLVNDCS 
L IHN^l/CMAAVFVDRAGE WKLGGLD YM YS AQSNGGG PPRKGI PE 
LEQYDPPEIADSSGRWREKRSADMNRLGCLIWEVFNGPLPRAA 
ALRNPG KI PKTLVPHYCELVGANPKVRPNP ARFLQNCRAPGGFM 
SNRFV5TNLFLEEIQIKEPAEKQKFFQELSKSLDAFPBDFCRHK 
VLPQLLTAFE FGNAGAVVLTPLFKVGKFLSAEBYQQKI I PWVK 
MFSSTDRAMRIRLLQQMEQFIQYLDEPTVNTQIFPHWHGFIiDT 
NPAIREQTVKSMLIjIiAPKLNEANLNVELMKHFARIjQAXDEQGP I 
RCNTTVCLGKIGSYL5ASTRHRVLTSAFSRATRDPFAPSRVAGV 
LGFAATHNLYSI1NDCAQKILPVLCGLTVDPEKSVRDQAFKAIRS. 
FLS KLESVS ED PTQLEE VB KDVHAASS PGMGGAAAS WAGWAVTG 
VSS LTS KL3 RSHPTTAPTETNIPQRPTPEGVPAPAPTPVPATPT 
TSGHWETQEEDKDTAEDSSTADRWDDEDWGSLEQEAESVLAQQD 
DWSTGGQVSRASQVS\TPTTWPPNPQSPTGAAGK\RGLLGTGLA 
GAKLPGATS *RYTAGQRV 


5710 


1 


562 


IPGSTISGEVELMARMAKTIDSFTQNOTRLWIIDGbnACEQDK 
VLQMLDTVRVLFSXGP FI AI FASDPHI I IKAI NQNLNSVPSGFK 
\LNGHD YMRN I VHLPVFLNSRGL/RQ/LQENPS * LQQQMBTFHA 
QILQGYRKKLTEEFHRTALGR*QNLVARQPSIDG+DAIGPELYV 
CIA I QFNTNKDDAT 


5711 


1526 


1130 


RRKPFQWTTVTQEAFSHHDVAFTSTPVLFYPDSAQPFIVKSESS 
SQIAXAVLSQQRPSLFHECAFHFFS* SLQRHTINLDQGI F+LLM 
LSEERQHLFES S / I WTTPHNLK +/FE IHEHLGSHEGHWTLF FLL 
QIL 


5712 


3 


1331 


GRKLFQSLDI S ERLKFLLTLDC VDDTL I VLAEEHG CUDI I KELP 
ETVIDIJ.NKCLTFHPSKRPTPDELMKDKVFSEVSPLYTPPTKPA 
SLFSSSLRCADLTLPBDISQLCKDINNDYTAERS I EEVYYLWCL 
AGGDLBKELVNKEI IRSKPPI CTLPNFLFBDGES FGQGRDRSS / 
TFR* YHWD I WM PAKK+ 1 ERCWGRS I LP I TLKMTSL I LPYSNFSN 
NELSAAATIiPLIIRBKDTEYQLNRIILFDRLLKAYPYKKNQIWK 
EARVDI P PLMRGLTWAALLG VEGAIHAKYDAIDKDTP I PTDRQ I 
E VDX PRCHQYDELLS S PEGHAKFRRVLKAWVVSH PDLVYWQGL D 
SLCMFLYIiNFNNEALVYACMSAFIPKYLYWFFLKDWSHVIQEY 
LTVFSQMIAFHDPELSNHLNEIGFIPDLYAIPWFLTMFTHVFPL 
HKIFKLW\ DTLLLGEFLFP ILYWB 


5713 


(!34 


284 


PVCAVP VDRWPVL PREDQBGQQL* AKLPRDFRR* FQI LGPMEGH 
TACRCSRRGACVQHLPREDIRAAE*DPHIjREVWPGLPTSSATSP 
♦RAVLTSPCSHLGSADAASSHWLCGVSFH 


5714 


212 


613 


WGLGLGPTMSSLGGGSQDAGGSSSSSTKGSGGSGSSGPKAGAAD
KSAWAAAAPASVADDTPPPERRNKSGIISEPLNKSLRRSRPLS
HYSSFGSSGGSGGGSMMGGESADKATAAAAAASLLANGHDLAAA

MA 


5715 


131 


1973 


ESASQQKR5KCLILTLKLELSGSAPKKTSARPGSSLWLPPHSQE 
QTPPAS KliQGGGGGLQTGWGliH? VPVTAAS PLPR WCLFGAVAK\ 
GLPGP*LCPSGAA/GGLQRGPGLSPLGAAGKVSCLHPPSMVENN 
DS TCHEHHEGI LAAR VTPVP \ SGKPGRVLKPPGRVCRPPHPAAS 
PRPPGS/SDLDGPRPQMHLRAFPAAHGGPVNTPHGGEEKTFMSS 
QIRRKETKPL* RKTPAG\NNYQSNSI PVSQSPQLTVDLLPSAGR 
TQAPSGRGDA3KPTPGHG\LPKASVILTPNCPCSLAGGQ*PPGL 
YPKTPKQRRWRRPL/ LLGPSQ *GSRQSTC* EV\GALGEPVRI PG 
L* PDLS CILSNGSKHRREGLS FPRSLGPGRRG PAGLQSLGCS PT 
PKNTACHS SGHVALQAGHDSARDVGSGHVALQAGHDSTQDVGRP 
VWRWI PLE * IiGLSRETGOATR'RRT iVTW T ^ PftR & b & & PVfi nana r i? 

EGPLRXiPGQDRGAQPCSHCPGRAAGQPEPGAGAPCRE/GG*DPT 
GLT/ GVPGTD PKRGGRK PGQSGQETQGPT VWSGPESP LQPKP * E 
RQE/VGAGASSGVGLSRGRAGGPS3AWBVAAMLLLLRHGSHSEL 
TDLTBAQTSQH 


5716 


1711 


1370 


RVFSLLCEGPGHCYQGAVCRBACAAASPGLDSAAEPriRiiCEHTD 
*LPK*GPGYIQHFHCDSNILCILYNISFNZiFSYSF*GVARYAC* 

RCPLVL* sgfftii vgg ysccmplxt 


5717


44 


1489 


LPTEALRES E W Vsfi YGK£(jPRGLVPEGE S TSPL PSS VDTEDSLD 
EGPGALVLESDLLLGQDLEFEEEEEEEEGDGNSDQLMGFERDSE 
GDS lgarpgXpyglsddh sgggralsabse vbeparopgsarge 
Rpgpacqlcggptgegpcggaggpgggpllpprllyscrlctfv 

SHYSSHXKRHMQTHSGBKPFRCGRCPYASAQLVNLTRHTRTHTG 
EKPYRCPHCPFACSSLGNLRRHQRTHAGPPTPPCPTCGFRCCTP 
RPARPPSPTEQEGAVPRRPEDALLLPDLSLHVPPGGASFLPDCG I 
Q\CGVKGRASAGLDQNHCQS/SLrPWTCRGCGQELEEGEGSRLG 
AAMCGRCMRGEAGGGASGGPQGPS DKGFACSLCPFATHYPNHLA 
RHMKTHSGEKPFRCARCPYASAHLDNLKRHQRVHTGBKPYKCPL 
CPYACGNLANLKRHGRIHSGDXPFRCSLCNYSCNQSMNLIRHM | 


5718 


120 


284


VAHALSLPAESYGNDVSMTHPQLPPTQLAWDLCRTCLPLSYNFT" 
S**STADPLHL 


5719 


48 


428


BLNNGPFQMPLCNGGNIiAVTGSWADRSPLHTAASQGRLIJ^RTLl 
LSQGYNVNAVTLDHVTPLHEACLGDHVACAR7LLEAGANVNAIT 
IDGVTPLFNACSQGSPSCAELLLEYGAKAQP\ESCLPSP 1 


5720 


1 


1051


LQAFRNASEVPMVLVGTQDAISAA\NPRVYimTSRARKLdTDuH 
\ RCT \ YYE \ TCGGT YGLQM WS VS FQDVAQKWAL \RKKQQ\liAI 
GPCK\SLPN\SPSH\SAVSAASIPARAPINQGHE/SGGGSAFSD 
Y\SSSVPSTPSISQRELRIETIAASSTPTPIRKQSKRRSNIFTS 
RKGADP\DRE KKAAGCKVDSIGSGRAI PI KQGILLKRSGKSLNK 
EWKKKYVTLCDNGLLTYHPSLHDYMQyiHGICEI DLLRTTVJCVPG 
KRL PRATPATAPGTSPRANGLS VERSNTQLGGGTGAPHS ASS AS 
MSEJPLSSSAWAGPRPEGIJIQRSCSVSSADQWSEATTSLPPGM 


5721 


97


492 


RHS S PCCSLRRTERSSNAAV9T/TTVQQFKRFIENYRRHI GCVA[ 
VFYAIAGGLFLERAYYYAFAAHHTGITDTTRVGIILSRGTAASI 
SFMFSYILLTMCRNLITFLR£TFLNRYVPFDAAVr>FHRLIASTA I 


5722


88


1043 


VALDVLAGSS PGGGMAQALLGPRVHGIRAVLRV/ARGGVqAPGAP 

gslgvshaaapparpqgaaqsphrgrr:4ggggaglppprsprfp 

qbsvpaststargprrvsrrlppqhpgprgrrrrpgagvgaprr 

grargqagllgrqgqggrgaereraalqarrgrrpgpbpdqscg 

grprraaaapgrapadpqppaprpapapdvrppadapapapapa 

ppppphlgai>tagsgeerqsqpraetlrlgrgaplp\prabrgg f 

rpkqaeqqq\pkrptppargpqssgdpamlpqraglrtgglagt 
ksstreipemi 1 


5723 


88 


1043 


valdvlag3s pgggmagallgprvhg 1 1 ravlrvarggvqapgap 
gslgvshaaapparpqgaaqsphrgrrhggggaglppprsprfp 

QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR 

grargqagllgrqgqggrgaereraalqarrgrrpgpepdqscg 

GRPRRAAAAPGRAPADP QPPAPRPAPAPDVRP PADAPAPAPAPA 

ppppphlgaltagsgeerqsqpraetlrlgrgaplp\praergg 

RPKQABQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT 
KSSTREIPEMI 


5724 


3 


1841 


FTNEAPPAPLPil^SASPtSPHRRA«SlJ5RRSTEPSVTPDLiNl^ 
KGWLTKQYEDGQWKKHWFALADQSLRYYRDSVABBAADLDGEID 
LSACYDVTEYPVQRNYGFQ IHTKBGEFTLSAMTSGI RRNWIQTI 
MKHVHPTTAPD VTSSL PEEKRKS 3 C3 FETCPRPTBKQEAELGE P 
DP EQKRSRARE \RRR EGRS KTFDWAEFR P I QQALAQERVGGVGP 
ADTH\DPWRPEAEHGEIiERERARRREERRKRFGMLDATDGPGTE 
DAALRMEVDRSPGLPMSDr,KTHNVHVEIEQRWHQVETTPLREBK 
QVPIAPVHLSSEDGGDRLSTHELTSLLEKELEQSQKEASDLLEQ 
NRLU2DQLRVALGREQSAREG YVLQATCERGPAAMBETHQKKI E 1 
DLQRQHQRELEKLREEKDRLLAEETAAT I SAI EAMKNAHREEME 
RELEKSQRSQ I SS VNSDVEALRRQYLEBI*QS VQRBLEVLS EQ YS 
QKCLEWAHLAQALEAERQALRQCQRENQELNA^QBLNNRLAAE 
ITRLRTIiLTGDGGGEATGSPLAQGKDAYELBVPSGARPCLTQLC 
TQB PQGSAAWPLS YR WGGTDLRQQESQGPGRS KSPEGGEEQ 


5725 


3 


1049 


VWGHSEBTSQSPmTEP^SDdSVDLGIS^TSDI^PQKSCPV^ 
SWKSHSITNM2IGGLKIYDILSDN\DLSSKLQPIK/FTSAVDG 
KNIVRSKAATLLYDQPLQVFTGSSSSSDLISGTKAIFKFDSNHN 
PB/GAKYNKRPHKWAHNIiHLKYMVLHSl ISNTVAV\RSQRHFVA | 
IjQTKS PNRPCQFSSSAPS / VDQRAQ/ INQS YAKHSANMNFSNHN 

NVRANTAYHLHQRLGPARHGEMWAISPNDRLI PAVTRSTIQRQS 

SVS STAS VNLGDPGSTRRAQI PEGDYLS YREFHSAGRTPPMMPG 

SQRPLSARTYSIDGPNASRPQSARPSINEIPERTMSVSDFNYSR 
TSP 


5726 


2 


486 


j SRS LSMW WNSGLPASSHS S KLP VT VG FS GCVKRLRLHGRP LGAP 
! TRMAGVTPCILGPLRAGLFFPGSGGVITL/BSVGAG1PGPSRAG 
I QGS PGGSGEGPPLSSPSQPLPADLPGATLPDVGLELEVRPLAVT 
GLI FHLGQARTPP YLQLQVTEKQVLLRADDG 


5727 


21 


221 


RPILILKETRRLPWATGYAEVINAGKSTHNEDQASCEVTiTVKKK 
AGAVTSTPNRNSSKRRSSLPNGE 


5728 


2 


877 


GTRNGQFEPRRGRAWSGSAGGLRAPGAAAGGPGVQPRGSG/tP<i 
NAIRAGVNPGRGPASPFWDLSLPWDLWPPPTDHAPGAPDFPAVE 
GR\PWAGGRPPWPVSGVLGSRVCGPLYSTSPAGPG/SGGLSPSQ 
GG PAGAGGDAG/liPGRCPSAPWRAGSRPAASCPDNIPGPQGLWL 
HRNPTS/GPPSQIGEGAEQGDEGVADAPQIQCKN/GABDPPAED 
EPPQVPEAOEEDAVPABEGPGGTPETQADQVRERPEAHLAEGGA 
KGSPRRLADPQDIjPAGQMSLAPPFPPVAAVIRSNK 


5729 


1 


152S 


aggarevltlui^hfagfvga^^nqqdaalgratdske^pgfX" 
cpdvlyrtgrtlhgqetytprlilmdlkgslsslkeegglyrdk 
qldaaiamqqk1.tthkeelypknpylqdflsaegvlssdgvwrv 
ks i pngkgss plptattpkplipteas i rvwsdflrvhlhprsi 

CMIQKYNHDGEAGRLEAFGQGESVLKEPKYQEELEDRiHFYVEE 
CDYLQGFQILCDIiHDGFSGVGAKAAELLQDEySGRGIITWGLLP 
GPYHRGEAQRNIYRUjNTAFGLVHLTAHSSLVCPIjSLGGSLGLR 

peppvsfpylhydatlpfhcsailataldtvtcs\ yrlcss pvs 
mvhl\adt4lsfcgkkwtagaiipfplapgqslpdslmqfggat 
pwtplsaggepsgtrcfaqswlrgidracrtsqltpgtpppsa 

IiHACTTGEE IIAQYLQQQQ PGVMS SSHLLLTP CRVAPPYPHLFS 
SCS PPGMVLDGS PKGAAVES VPVFG 


5730 


1258 


1713 


KKFQAPARETCVECQKTVYPMERZiLANQQVFHI S CFRCS YCNNK 
LSLGTYASLHGRIYCKPHFNQLFKSKGNYDEGFGHRPHKDLWAT 
KIETEGFWERPRWFENCGRPLKSPGGEDCP3C*GGCPGSNY*AQ 
GSS SREKG3QASKNPKLRVA 


5731 


122 


443


RSHRGE L I PKDS CYMRKPPRRP KKRRQG 1 CAtiPQG CLTFKDVA I 
EFS LEE W KCLNPAQRALYRAVMLENYRNtiESVGLTSKDS W YMRK 
KPGRGRGKQRRQEWFPLRVY 


5732 


226 


772 


PPSRS CQSPRRKS RRRAHVXVTLVCGFTSFSFSLPLYLCGCLRF 
PERTCS QK2QADWAPDFGPS SFVPS WGATATGARKFLI A FNI \N 
IXGTKEOAHRIALNIJ^GRGKDQPGRLKKVQGIGWYIrDEKNLA 
QVS TNLLDFEVTA1»HTVYE ETCREAQELS LPVVGSQLVGLVPLK 
ALLDAA 


5733 


1 


460 


P ALQE VNA>IALAWGKQYENDARTLFEFTSGVNDTES P 1 1 YRDES""" 
MRTACSPDGliCSDGNGLEL KCP FTSRD FMKFRLGGFEAIKSAYM 
AQVQYSMWVTRKNA!7YFANYDPRMKREGLHYVVIERDEKYM\AS 
FDEI \VP\ SFIGKMDEVLSRDPM 


5734 


3 


968 


RCNSPESLTSLLVLLTTANNLFVLiPAV^KNRAYAlFFIVFTVI ' 
GSLFLMNLLTAI I YSQFRGYLMXSLQTSLFRRRLGTRAAFE VLS 
SMVG EGGAFPQAVGVKPQNLLQVLQ KVQLDSSHKQAMMEKVRS Y 
GSVLLSABEFQKIiFNELDRSWKEHPPRPEYQSPFLQSAQFLFG 
«* xr Ua bUWUl/UiANIjvS IC VFLVLDADVLPAERDDFI LGILNC 
VFI VY YLLEMI^ICVFALGLRGYI^ Y PS tTVFDGIiLTVVL LVLE I S 
TL\VCTDCHTQAGGRRWW/RIiLSLWDMTRMLNMr,IVFRFLRIIP 
SMKPMA WAS TVLGL 


5735 


2 


540 


FFTPCVARAFNFPDQATVKKAAYSIiPRVGGGTS CGLPQARRISL " 
ATPRQLYK/ SSNMTQRWQRRE I snfe ylmflntiagrtywdlnq 
YPVFPWVIiTNYES EBLDLTLPGNFRDLS KP IGALNPKRAVFYAE 
RYETWBDDQSPPYHY1JTHYSTATSTLSWLVRIVSIFIELACLWY 
LKILT 


5736 


1 


382 


GTRPSTKKSGYSPQgVA\aHCTGHQKF^TAVAHSNQKADSAAQ^ 
TARl^VTPPNLLPTVSFPQ&bulPtiNPVYSTTIBKIJ^DU^KN 
QES* * ILPDSGIPI P *T*TS YLQSTTHLRRAKLPQLLRR 


5737




1041 


KACLHLLSS FLTSNPLFNPIiLPDSLYSVEARSQRAWLGPCRRKR 
LQTLMRLAAGFQYS3HKDPSLSAKEKETDYHNEARGPWPGWVG* 
RTADGSCGRGPDGAHHPGPXSSSWRASRLLPGliGGSHHLDAYVG 
RDLECGTP APLQLB I P PQ PRGHPAP IFTGQAG PRDS G PGAS P* V 
ETR PLTDGRR * PGVR PVGWTPAHPAGTLRP RGAVBPS VSACGKW 
APS PTSQGCCEGRCDAVPKHRAWRTPLCSQ 


5738 


B 


460 


DixiSLNCTbPETI»PMTPSP*LSFL*FPGLARAKSIPTKTYSNEV 
VTL>fYRPPDlLLGSTDYSTQIDMW*GQVEVWQGPCGKGGGLVTT 
ATQ PAAFLFTVPSLPRGVGCI FYEMATGRPLPPGSTVEEQEiHFI 
FRI ^SEEAWALCAVETHR 


5739 


1 


1222 


SFQRRGIRVWVHTLHPHPRAVWAGIGRGHGS*AIXGRA3iAPALC 
FP TLLEFL ES LE PDLPALRAMG LHLWAAGPGTHPAG I SDLLAEV 
SAEVDGPVPGYLSSPQSITDTCXYIFTSGTTGLPKAARISHLKI 
LQCQGFYQLCG VHQED VI YLALPLYHMSGS LLGI VG CMG I GATV 
VL KS KFS AGQ F W ED CQ QHR VTVFQ Y I GELCR YLVNQ P P S KAERG 
HKVRIAVGSGLRPDTWERF VRR FGPLQVLETYGLTEGNVATINY 
TGQ RG AVGRAS WLYKH I FP FSLI RYDVTTGE P I RDPQGHCMATS 
PGE PGLLVAP VSQQS P FLG YAGG PE LAQ G KLLKDV FRPGDVFFN 
TRDLLVCDDQGFLRFHDRTGDP FRWKGENVATTBVABV FEALDF 
LQE V N V YGVTV 


5740 


265 


231 


PAYW LKVPT LCLESKTDLREKAS HVSAQLQGE VRGIiAGALWM* A 

YVYERVYN*NISRMVHALBQKRHPAGLSSSMALQLN^ 

LQS E LHKLYDEETQS WVS G SACGG Y P 


5741 


1 


650 


PRKTMRRGVLMTLLQQS AMT JUPLW IGKPGDRPP PLCGA I PASGD 
YVARPGDKVAARVKAVDGDEQWIIAEWSYSHATNKYEVDDIDE 
EGKERHTLSRRRVIPLPQWKANPETDPEALFQKEQLVIALYPQT 
TCFYRALIHAPPQRPQDDYSVLFEDTSYADGYSPPuNVAQRYVV 
ACKEPKKK*CRIiADSPSPNDTGQDSRGRAGIKHIPPLKKK 


5742 


2 


362 


TQSVKEILKRNPNVNLTDKDGNTALMIASKEGHTEIVQDLLDAG 
TYVN I PDRSGDT VI»IGAVRGGHVB I VRALLQKYAD I DIRGQDNK 
TALYWAVEKGNATMVRDILQCNPDTEI CTKDG 


5743


2 


415 


GKTP EG I DAI EE IE I DLEETEREI S P Q ENG LIE E VK PLG EMQTDI* 
KATGREISPRBKTPEVIDATEEIDKDLEETGRRBISPEBNGPEE 

VKPVDEMETDLKTTGREGSSREKTREVIDAABVIBTDLEETERE 
ISPQE 


5744 


3 


703 


TRRTTTTSPTTTRQMTTTPAAIjPTTVVTTPDLTTGTPLQMTTIA 
VFTTANTCI^^TPSTIiPEBATGLLTPEPSKEGPIIiTAESBTVLP 
SDSWSSAESTSADTVUJTSKESKVMDLPSTSHVSiyiWKTSDSVSS 
POPGASDTAVPEQNKTTKTGQMDGIPMSMKNEMPISQLLMHAP 
SLGFVL FALFVAFLLRGKLMETYCSQKHTRIZ)YIGDSKNVLNDV 
QHGREDEDGLFTL 


5745 


1400 


599 


GKSREWLMKHSKKTYDSFQDELEDYIKVQKARGLEP^CFMJT" 

KGDYLETCGYKGEVNSRPTYRMFD0RLP3ETIQTYPRSCNIPQT 

VENRLPQWLPAHDSRLRLDSLSYCQFTRDCFSEKPVPLNFNQQE 

YICGSHGVEHRVYKHFSSDNSTSTHQASHKQIHQKRKRHPEEGR 

EKSEEERS KHKRKKS CEEIDLDKHKS IQRKKTEVE IETVHVSTE 

KLKNRKEKKSRDVVSKKEERKRTKKKKEQGQERTEEEMLWDQSI 


5746 


3 


821


S FASGRJjTPSSPAFDGEIjDLORYSNGPAVSAWQLnMf3avcw7?EvD~" 
RAGERRFPCPVCGKRFRFNS ILALHLRTHQPERPRSPAARLLLE 
LEERAIiLREARLGRARSSGGMQATPATEGIiARJpQAPSS SAFRCP 
YCKGKFRTSAERERHLHILKRPWKCGLCSFGSSQEEELLHHSLr 
AHGAPERPLAATSAAPP PQP QPQP PPQPE PRSVPQPEPEPQ PER 
E ATPTPAPAAPEEP PAP PEFRCQVCGQS FTQS WFLKGHMRKH KA 
SFDHACPV 


5747 


2 


1328 


DRHVETLCIHFIX3PSTGSTAKTGGRNWLKTGNCLYGNTOiFVHG 
PSPRGKGYSSNYRRSPERPTGDLRERIKNKRQDVDTEPQKRNTE 
BS5S PVR KESSRGRHRSXEDI KI TKERT PES BEENVE WETNKDD 
SDNGDINYDYVHELSLEMKRQKIQRBLMKDEQBNMEKREEIIIK " 

KBVSPEWRSKLSPSPSLRKSSKSPKRKSSPKSSSASKKDRKTS 

AVSSPLLDQQRNSKTNQSKKKGPRTPSPPPPIPBDIALGKKyKB 

KYKVKDRIEEKTRDGKDRGRDFERQREKRDKPRSTSPAGQHHSP 

1 S S RHHSSS 3 QSGS5 1 QRH5 P SPRRKRTPS PS YQRTLTP P LRRS 

ASPYPSHSLSSPQRKQSPPRHRSPMREKGRHDH3RTSQSHDRRH 

ERREDTRGKRDREKDSREEREYEQDQSSSRDHRDDRBPRDGRDR 

RE 


5748


934 


473 


segpqvfykgijvptliaifpyaglqf*scV6$lkhlykWaipabg 

KKNENLQNLLCGSGAGVISKTLTYPLDLFKKRLQVGGPEHARAA 
FGQVRRYKGLMDCAKQVLQKEGAXGFFKGLSPSL1JCAALSTGFM 
FFSYEFFCNVFHCMNRTASQR 


5749 


552 


1 


QFPVDPRVRGSTLSLAERPKGMIRSGSFRJDPT/DDVHGSVLSLAS " 
SASSTYSSAEERMQSEQIRKLRREIjESSQEKVATL1SQLSAN7AN 

i»v7aafeqslvkmtsrlrhliaetaeekdtelldlrettdfl*kkkn 
seaqaviqgalwasettpkelrikrqnssdsisslnsitshssi 
gsskdada 


5750 


22 


866 


I FI S I CLWNAHLCFLLL PKDCIDQVMKLQNIjFVDDSGR Y1A.I QF 
IILBWAYVFLYYYEYRKAKDQLDIAKDISQLQIDLTGALGKRTRF 
QENYVAQLILDVRREGDVLSNCEFTPAPTPQBHIiTKNLELNDDT 
ILNDIKLADC3QFQMPDLOVEBIAIILGICTNFQKNNPVKTLTE 
VELLAFTS CLLSQPKFWAIQT5ALI LRT KLEKGSTRRVERAMRQ 
TQALADQFEDKTTS VLBRLKI FYCCQVP PHWAIQRQLASLLFEL 
GCTSSALQIFEKLEMWH 


5751 


3 


751 


SCGSALRAWRCGAAALAT FPAP AL PGLMYRAL YAFRS AE PNALA 
FAAGETFLVLERSSAHWWLAARARSGETGYVPPAYLRRLQGLEQ 
DVLQAI DRAI EAVHNTAMRDGG KYS LEQRG VLQKL ITHHRKETLS 
RRGPSASSVAVMTSSTSDHHLDAAAARQPNGVCRAGFERQHSLP 
SSEHLGADGGLFQIPLPSSQIPPQPRRAAPTTPPPPVKRRDREA 
LMASGSGGHNTMPSGGNSVSSGSSVSSCI 


5752 


3 



471 


GPVCGVGJLSVAWAGPWRGPVHSVGGGGRAAL^GAELPCLSGAAT " 
VEREMELRHKNEMLRVETE ARARAKAER ENADI IREQIRLKASE 
HRQT VLES I RTAGTLFGEG FRAFVTDRD KVTAT VN I F I KQG WQ V 
AERQHVGASWSPRSCPCRLCTAXi | 


5753 


34 


483 


DDSXAI PGGVQAP FGAVRNI YTPRTGHRIRKLDQ IQSGGNYVAG "" 
GQEAFKKLNYIjDIGEXKKRPMEWNTEVKPVIHSRINVSARFRK 
PLQEPCTIFLIANGDLINPASRLLIPRKTLNQWDHVLQMVTEKI 
TLRSGAVHRLYTLEGRLV 


5754 


14 


331 


TLVHWE FAGEHAEAI ASREQEVIiQGW KELLS ACEDARLHVS S T 
ADALRFHSQVRDLLSWMDG IASQIGAADKPRCPSSLLGLPAS PW 
WPTPATPS PLTAPFSME 


5755 


3 


988 


LGDQFYKBAIEHCRSYNSRLCAERSVRLPFLDSQTGVAQNNCYi' 
WMBKRHRGPGLAPGQLYTYPARCWRKKRRLHPPEDPKLRLLEIK 
PE VELP LKKDG PTS E STTLEALLRGEG VEKKVDAREEES I QE IQ 
RVLENDENVEEGNEEEDLEEDI PKRKNRTRGRARGSAGGRRRHD 
AAS QE DHDKP YVCD I CG KRY KNR PGL S YHYAHTHLAS EEGDEAQ 
DQETRS P PNHRNENHR P QKG P DGTVT PNNYCD FCLGGS NMNKKS 
GRPEELVSCADCGRSAHLGGEGRKEKEAAA 


5756 


3 


621 


SSKLQALFAHPLYNVPEEPPLLGAEDSLIiASQEALRYYRRKVAR" 
WNRRHKMYR2QMNLTSLDPPLQLRLEAS WVQFHLG INRUGLYSR 

HLKLVLRFSDPGKAMFKPMRQQRDBETPVDFFYPIDFQRHNAEI 
AAFHLDRILDFRRVPPTVGR I VNVTKEI L 


5757 


3 


473 


ykdalllpdnhrowfengtlkltdVqkgmdegeylcsvliqpq 
ls isqsvhvavkvp pliqpfefppasigqllyi pcwssgdmpi 
r itwrkdgqvi isgsgvti es kb fmsslq i s s vsiikhngn ytc i 
asnaaatvsrerqlivrvpprfw 


5758 


1 


474 


FRRGAGAERGEHREGERGAAGMGEFKVHRVRFFNYVPSGIRCVA 
YNNQSNRLAVSRTDGTVE I YNLS ANY FQE KFFPGHESRATEAL C 
WAEGQRLFSAGLNGE I M E YDLQALN I KYAMDAFGGP I WSMAASP 


5759


2 


1240 


SUaUbLVGCBDGS VKLFQITPDKI PV " ' 

UWiWit HGU<j V V X k. 1 KHMSULPii XTTNGTVHWVNNQ IG FTTDPR 
MARSS P YPTD VAR WNAPIPHVN7ADDPBAVI YVCS VAAE WRNT ? 
NKDVGADLVCYRRRGHNEMDEPMPTQPLMYKQ IHRQVPVLKKYA 
DKL I AEGT VTLQEFE EE I AKYDRI CEEAYGRS KDKK X LH I KHWL 
DSPWPGFFNVDGEPKSMTCPATGIPEDMLTHlGSVASSVPliEDF 
KIHTGLSRIIJIGRADOTKNRTVDWAIjAEYMAFGSLLKEGIKVRL 
NGQDVERGTFSHREHVLHDQEVDRRTCVPMNHLWPDQAPYTVCN 
SSLS E YGVLG FBLG YAMAS PNALVLWEAQFGDFHNTAQC 1 1 DQF 
I STGQAKW VRHNGIVLLLPHGM EGMG PEH5 SARP ER FLQMS NDD 
SDAYPAFTKDF3VSQL 


5760


1 


<L4«1 


VRDI TSDSLS I*S WTVPEGQFDKFliVQFKNGDGQPKAVRVPGHED 
GVTI SGLEPDHKYKMNLYGFHGGQRVGPVS AVGLTAPGKD E EMA 
PASTEPPTPEPPlKPRtiEEIjTVTDATPDSLSLSWTVPEGQFDHF 
LVQYKNGDGQPKATRVPGHEDRVTI SGLE PDNKYKMNLYGFHGG 
GRVGP VSAIGVTAAE BETPTPTEPSMEAPB PPE E PLLGE LTVTG 
SSPDSLSLSWTVPQGRFDSFTVQYKDRDGRPQVVRVGGEESEVT 
VGGIiEPGlUCYKMHLYGLHEGRRVGPVSTVGVTAPQEDVDETPSP 
TEPGTSAPBPPEEPLLGELTVTGSSPDSLSLSWTVPQGRFDSFT 

VQYKDRDGRPQAVRVGGQESKVTVRGLBPGRKYXMHtiYGLHBGR 
RLGPVSAIGVT 




3 " "' 


-275 


SCD^EAAAJjWIRGPGFGCKAVRCASGRCTVRDFIHRHCQDQN 
VP VENFFVKCNGALINTS DT VQHGAVYS LEPRLCGGKGGFG SML 
RALGAQIBKTTNR^CRDLSGRRLRDVhmEKAMAEW^QQAERE 
AEKEQKRLERLQRKLVEPKHCFTSPDYQQQCHEMAERLEDSVLK 
GMQAAS S KM VSAE I SENR KRQWPTKS QTDRGAS AGKRRC PWLGM 
EGLETAEGSNSESSDDDSEEAPSTSGMGFHAPKIGSNGVBMAAK 
FPSGSQRARWNTDHGSPEQLQIPVTDSGRHILBDSCAELGESK 
EHKBSRMVTETEETQEKKAESKEPIEEEPTGAGLNKDKETEERT 
D3ER VAE VAPEER ENVAVAKLiQ ESQ PGIJAV I D KET IDLLA FTS V 
ABLELLGLEKLKCELMALGLKCGGTLQ 


5762 


2 


344 


GS'PGQTPLHSQGGGGGSGGGRRRTPRGMPKEKYEPPDPRRMYTI 
MSSEBAANGKKSKWAELEISGKVRSLSASLWSLTHLTALHLSDN 
SLSRI PSDIAKLHNLVYLDLSSNKIR 


5763 


3 


429 


LDKDTGLIMLIARLDYEL1QRFTLTIIARDGGGEETTGRVRINV * 
LDVNDNVPTFQKDAYVGALRENEPSVTQLVRLRATDEDSPPNNQ 
1TYS I VSASAFGS YFDIS L YEG YGVIS VSRPLDYEQ I SNGL I YL 
TVMAMDAGN 


5764 
5765


19 


441 


VCARACGEMRQLUPiDbORYDENEDLSDVEEIVSVRGFSLEEK 
LiK^^xtZ^ut VnAMEGKDENYE YVQREALRVp LI FREKDGLGIK 

MPDPDFTVRDVKLLVGSRRLVDVMDVNTQKGTSMSMSQFVRYYE 
TPEAQRDKL 




3 


625 


QKILRUfNSkQPPTSSSNSKDCGGPASSGAGATAALADGL^ 
VQAS APQGWS HKETS KS KVKRS KTS KDANKS LPS AAL YG I PEI S 
STGKRQEVQGR PGEATGMNSALGQS VSSGGSGNPNSNSTSTSTS 
AATAGAGSCGKSKEEKPGKSQSSRGAKRDKDAGKSRKDKHDLLQ 
GHQNGSGSQAPSGGHLYGFGAKSNGGGAS PFHCGGTGSGSVAAA 
or. v a naMturo MltVKKEEEE EESHRR I KKLKTEKVDPLF 
TVPAPPPHV 




1608 




SGLFS VDPASSQAMBLSDVTLI EGVGNEVMWAGV\A/LILAI*VIj 
AWLSTYVADSGSNO^LGAIVSAGDTSVlJjl^HVDHLVAGQGNPE 
PTELPHPSEGNDEKAEEAGEGRGDSTGEAGAGGGVEPSLBHLLD 1 
IC^LPi0lQAGJW5SSSPBAPI^EDSTCLPPSPGLITVRLKFLND 
TBEIAVARPEDTVGALKSKYFPGQESQMKLIYCGRIXQDPARTL 
RSLNITDNCVIHCHRSPPGSAVPGPSASLAPSATEPPSLGVNVG 
SLiWPVFVVIjLGVVWYFRINYRQ FFTAPATVSLVGVTVFFSFIjV 
FGMYGR 


5767 


2 


3 


NIFRATPRPPTkPiil^RTGTEVILWYLDWRAI^KRKRMKANIKLVG 
5GFPLPSSDLDDSLTEEIDEKIGFRNDANFDWQNVADFRDAGGS 
bTEVKVEEEBRDPQS PEFE I EEEEEMIiS S VI PDSRREN ELPDFP 
HIDSFFTLNSTPSRSAYDEFilJbLVWlEKQKLELEKRRLDIEAER 
LQVEKERLQIEKERLRHLDMEHERLQLEKERLQIEREKLRbQIV 
KSEKPSLBNELGQGEKSMLQPQDIETEKIiKIjERBRLQLBKDRLQ 
FLKF2SBKLQIEK2RLQVEKDRLRIQKBGHLQ 


5768 


3 


476 


SSRSRLSV3VSPPPPGIVELGPPFAWEFCSRLGSAVTSQRAGPA " 
AAMVAKDYPFYLTVKRANCSLELPPASGPAKDAEEPSNKRVKPL 
SRVTSLANIiIPPVKATPLKRPSQTLQRSISFRSESRPDILAPRP 
KSRNAAPSS TKR RDSKL WSETFDVC 


5769 


38 


667 


TKTKKG VKEKATDQS VKAFAKI 1C PELQ YVGFMGCS VTS KG VIHL 
TKLRNLSSLDbRHITELDNETAMEIVKRCKNLISUJLCLNWIIN 
DRCVEVTAKEGQNIiKEL YLVSCKITDYALIAI GRYSMTI ETVDV 
GWCKE 1 TDQGATLIAQS SKSLR YLGLMRCDKVNEVTVEQLVQQY 
PHITFSTVLQDCKRTLERAYQMGWTPNMSAASS 


5770 


1 


484 


DSRRYDVK'rRKWSFLLEEHSKLIAKVRCLPQVQLDPLPTTLTLA 
PASQLKKTSLSLTPDVPEADLSEVDPICLVSNLMPFT)RAf:vwiraT 

AKGGRLLLADDMGLGKTIQArClAAFYRKEWPLLVWPSSVRFT 
WEQAFLRWLPSLSPDCINVWTGKDRLTA 


5771 


168 


741 


GLLPSACLRARSWREASEGPSSRACSNGSQDTFEACYSGTStfPS" 
FHGSHCSGSDHSSLGLEQLQDYMVTLRSKLGPLEIQQFAMLLRE 
YRLGLPIQDYCTGLLKLYGDRRKFLLIjGMRPFIPDQDIGYFEGF 
LEGVG IREGGILTDS FGRI KRSMSSTSASAVRSYnr: A>no PPin 
AFHRLLADITHDI3 


5772 


148 


383 


EFNIALVSPSHPQIKAEDDQPLPGVI.LSLSGGLFRSNLIjTQDNG 
ILTFSNLVTCSAI YHLPVFPERB PGCSMRDLRVA 


5773


2 


723 


prvrskhnkcfmemntrlqvehpvtemitgtdlvewqlriaage " 

KIPLSQEEITLOGHAFEARIYAEDPSMMFMPvanm \rur e?T»non 
DPSTRI ETGVRQGDEVSVHYDPMIAKLWWAADRQAAJCTKLRYS 
LRQYNIVGLHTNIDFLLNLSGHPEFEAGNVHTDFIPQHHKQLLL 
SRKAAAKESLCQAALGLILKEKAMTDTFTLQAHDQFSPFSSSSG 
RRLNISYTRNMTLKDGKNSK 


5774 


2 


592 


FVEBENIRVVRCGGSELNFRRAVF;?An<?KVTPr^<irtnTrx/TrwVc'r 
VTEECVHILHGHRNLVTGIQIiNPNNHLQIjYSCSLDGTIKLWDYI 

dgili ktftvgcklhaiiftlaqaedsvfvtvnkekpdlfqlvsv 
klpksssqrveakelsfvldyinqspkciafgnegvyvaavref 
yls vyffkketts r vtls s s 


5775


3 


£38 


SSGCOJFAAPSSIAEAATMPVSKCPKKSBSLWKGWDRKAQRNGL 
RSQVYAVNGDYYVGEWKDNVKHaKGTQVWKKKGAIYEGDWKFGK 
RDG YGTLSLPDQQTGKCRRVYSGWWKGDKKSGYG I QFFGPKEYY 
EGDWCGS QRSGWGRMYYSNGDI YEGQWENDKPNG EGMLRLSQNP 


5776 


2 


484 


RLPQDCVCQI^BSLGTLCPSK^LLFVPPDIDRRTVELRL'GGNF 
IIHISRQDFANMTGLVDLTLSRNTISHIQPFSFLDLESLRSIiHL 
DS NRL PS LGE DTLRGLVNLQHL I VNKNQLGGI ADEAFEDFLLTL 
EDLDLSYHNLHGPAVGLRGDAWVQPSTS 


5777 


2 


949 


GQDPEPGQDLFQPEREVDPSWGRGREPRLGKI»RFQNDHtSVLKQ 
VKKLEQALKDGSAGLDPQLPGTCYSPHCPPDKAEAGSTLPENLG 
GGSGSEVSQRVHPSDLEGREPTPELVBDRKGSCRRPWDRSLBNV 
YRGSEGSPTKPFINPLPKPRRTFKHAGEGDKDGKPG I G FRKBKR 
NLPPLPSLP P PPLP S SPPPSS VNRRLWTGRQK5SADHRKS YE FE 
DLLQSSSESSRVDWYAQTKLGLTRTLSEENVYEDILDPPMKENP 
YEDI ELHGRC LGKKCVLNFPAS PTSS I PDTLTKQS LS KPAFFRQ 
NS2RRNV 


5778 


1 


1210 


QRRQSVSRLLLPVFLLEPPAEPGLEPPPEEEGGEPAGVAEEPGS 
GGPCMLQLEEVPGPGPLGGGGPLRSPSSYSSDELSPGEPLTSPP 
WAPLGAPERPEHLLNRVLERLAGGATRDS AAS DILLDDI 7LTHS 
LFLPTEKFLQELHQYFVRAGGMEGPEGLGRXQACLANLLHFLDT 
YOGLLQBEEGAGHIIKDLYLLIMKDEStiYQGLREDTLRLHQLVE 
TVELKIPEENQPPSKQVKPLFRHFRRIDSCLQTRVAFRGSDEIF 
CRVYMPDHS YVTIRSRLS ASVQD I LGS VTEKLQ YSE EPAGREDS 
LZLVAVS SSGEECVLLQ PTEDC VFTALG INSHLFACTRDS YEALV 
PLPEEIQVSPCiUTEIHRVEPBDVANHLTAFHHELFRCVHS'LEFV 
DYVFHGB 


5779 


138 


1671 


EAVQVLI KH 3 AD VNARD KNWQT P LHVAAAN KAVKCAEVl I PLLS 
SVNVSDRGGRTALHHAALNGHVEMVNLLLAKGANINAFDKKDRR 
ALH WAAYMGHLD WALL INHGAEVTCKDKKGYTPLHAAAS NGQ I 
NWKKLLNLGVEIDBINVYGNTALHIACYNGQDAWNELIDYGA 
NVNQPh^JGFTPLHPAAASTHGALCLELLVNNGADVNIQSKDGK 
S PLHMTAVHG RFTRS QTLI QNGGB ID CVDKDGNTPLHVAAR YGH 
EI^INTLITSGADTAKCGIHSMFPLHLAALNAHSDCCRKLLSSG 
QKYS I VSLFSNEH VLSAGPEIDTPDKFGRTCLHAAAAGGNVECI 
KLLQS SGADFHKKDKCGRTPLHYAAANCTFHCIBTLVTTGANVN 
ETDDWGRTALHYAAASDMDRNKT I LGNAHDNSBBLBRARELKBK 
EATLCLEFLLQNDANPS I RDKEG YNS IHYAAAYGHRQCLELLLE 
RTNSGySESDSGATKSPLHLAVSEMP 


5780 


154 


624 


QFFRVITCLPFKGPDYRLYKSEPBLTTVAEVDESNGEBKSEPVS 
EIETSVVKGSHFPVGWPPRAKSPTPESSTlASYVTLRK~KKm 
DLRTERPRSAVEQLCLAESTRPRMTVEEQMERIRRHCQACLREK 
KKqLNVIGASDQSPLQSPSNLRDNP 


5781 


19 


941 


RGSLGGHPWRPPMRAASQGCLPVSFVTGPHQERAYGGRGPGGAF"" 
PAPPVSGTCPPDLIYAPTPEKAEGGSQKNHQPPPGERAAHRDGE 
QAP CRAG PTRKVAVAPRP PS CP *GPE \ PGEEPRRPLDRS P PLGQ 
V^PHFTSQDAKSAEDEAPSRHLGKHQPRSAQVGSRLDALQGPKT 

qhsihtvtcksprqkedrspkppqapkhpeehgrqs\qapppl»p 

VAPS RTCGGC * TWDPALLVS P / PQGDSTPELPAP \QQPTGG PS R 

CRQALPPQG*RQQPRQRPR/PTGASRSHPAKAKGCQGPPKIRNY 
NIMD 


5782


5176 


1237 


D RSMMS MAADS YTDS YTDT YTEAYM VPPLPPEEP PTMP PLPPEE 
PPMTPPLPPEEPPEGPALPTEQSALTAENTWPTEVPSLPSEESV 
SQPEPPVSQSEISEPSAVPTDYSVSASDPSVLVSEAAVTVPEPP 
PEP ESS I TLTFVES AWAEEHEWPERP VTCMVS ETPAMSAEPT 
VLASE P PVMSETAET FDSMRAS GHVAS E VSTSLLVPAVTTP VLA 
ESILEPPAMAAPESSAMAVLESSAVTVLESSTVTVLESSTVTVL 
EPSWTVPEPPWAEPDYVTIPVPWSALEPSVPVLEPAVSVLQ 
PSMIVSEPSVSVQESTVTVSEPAVTVSEQTQVIPTEVAIESTPM 
ILESSIMSSHVMKGINLSSGDQNLAPEIGMQEIALHSGEEPHAE 
EHLKGDFYE SEHG IUI DLNI NNHLI AKEMEHNTVCAAGTSP VGE 
I GB EKI LP TSETKQRTVLDT YPGVSEADAGETLS STG PFALE PD 
ATG \TS KG I 2FTTAS TLSLVNKYD VDLS LTTQDTEHDMLIS TS P 
SGGSEADIEGPLPAKDIHLDLPSNINLVSSD1NEPLPVKRD\DQ 

tlaali\sl:<essggekevppps*rehlpdsgfsaniedinkad 

LVRPVSS PRTWNVLPS PRAGL\EGP\LLASDFGPVQNLYSSPW 
\SSMP\ERASGS\SSGEKGG\YEIFVKVXDTHEKSKKNKNRDKG 
EKEKKRDSSLRSRSKRSKSSEHKSRKLTSESRSRARKRSSKSKS 
HRS\QTRSRSRS/RBRRRRSSRSRSKSRGRRSVSKEKRKRSPKH 
R5 KSRERKR KRSS SRDNRKTVRARSRTPS RRSRSKTPSRRRRSR 
SVGRRRSFSISPSRRSRTPSRRSRTPSRRSRTPSRRSRTPSRRS 
RT PS RRSRTPSRRRRSRS WRRRS FSIS PVRLRRSRTPLRRR FS 
RSPIRRKRSRSSERGRSPKRLTDLDKAQLLEIAKANAAAMCAKA 
GVPL P PWLKP AP PPTI EEKVAKKSGGATI EELTBKCKQ I AQS KE 
DDD V I VN KPHVS DEEEEEPPF YHHPFKLS E PKPI FFNLNl AAAK 

PTPPKSQVTLTKEFPVSSGSQHRKKEADSVYGEWVPVEKNGEEN 
KDDDNVPSSNLPSEPVDrQTaMQPDar.anvDT o cxm t?™ 

LNRAQER I DAWAQLNS IPGQFTGSTGVQVLTQBQLANTGAQAWI 
KKDQFLRAAPVTGGMGAVLMRKMGWRBGEGLGKNKEGNKEPILV 
DFKTDR IOGLVAVGERAQKRSGNFS AAMKDLSGKHP VSALMEICN 
KRR WQP PE FLLVHDSGPDHRKHFLFRVL INGSAYQPNCM FFLNR 
Y 


5783 


1693 


698 


DSGLRVAFTMEGISNFKTPSKLSEKKKSVLCSTPTINIPASPFM 
QKLGFGTGWVYIJviKRSPRGLSHSPWAVKKIKPICNDHYRSVYQ 
KRLMDSAKILKSLHHPWIVGYRAFTEANDGSIX^AMEYGGEKSL 
NDLI EE / PI * SQ/ PKILFQQP/L I LKVALNMARGLKYLHQEKKL 
LHGDI KS SNWI KGDFET I KI C DVGVS LPLDEIWTVTDPEACYI 
GTEPWKPKBAVEEMGVITDKADIPAFGLTLWEMMTL3IPHINLS 
NDDDDEDKTFDESDFDDEAYYAALGTRPP INMEELDESYQKVIE 
LPS VCTNED PKDRPSAAKI VEALETDV 


tin oa 
3 ton 




1388 


prvrprvrtdhnyyisriygpsdsasrdlwvnIixjmekdkvkih 
gi lsnthrqaarvnls fdfpfyghflre itvatggf i ytgewh 
rm ltatq y i aplmanfdpsvs rnstvr yfdngtalvvqwdhvhl 
qdnynlgs ftfqatllmdgri i fgykbipvlvtqisstnhpvkv 

GLSDAF WVHRI QQIPNVRRR? I YEYKRVELQMSKI TKTI SAVEM 
TPLPTCLQFNRCGPCVSSQIGFNCSWCSKLQRCSSGFDRHRQDW 
VDSGCFEE£KEKMCENTEPVET\FLEPPQP*3RQPPSSGS*LPP 
E / DAVTSQFPTS LPTBDDTKI ALHLKDNGASTDDS AAEKKGGTL 
HAGLI VGILILVLI VATAILVTVYMYHHPTSAASI FFI BRRPSR 
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEQC 


5785 


266? 


1388 


PRVRPRVRTUHNYYISRIYGPSDSASRDLWVNIDQMEKDKVKIH 
GI LSNTHRQAARVNLS FDFPF YGH FLRE I T VATGGFI Y TGE WH 
RMLTATQYIA? LMANFDPS VSRNS TVR YFDNGTALVVQWDHVHL 
QDNYNLGSFTFQATLLMDGRIIFGYKEIPVLVTQISSTNHPVXV 
GLSDAFVWHRIQQIPNVRRRTIYEYHRVELQMSKITNISAVEM 
TPLPTCLQFNRCGPCVSSQIGFNCSWCSKLQRCSSGFDRHRQDW 
VDSGCPEESXEKMCBNTEPVET\FLEPPQP+ERQPPSSGS*LPP 
E/DAVTSQFPTSLPTEDDTK IALHLKDNGASTDDSAAEKKGGTL 
KAGLI VGI LILVLIVATAI LVTVYMYHHPTSAASIFFIERRPSR 
WPAMKFRRGSGHPAYAEV2PVGBKEGFIVSEQC 


5786


25^52 


1674 


SYKLPAAERRASSCSQP PTPTRRRWPAPGRTSRGHRPQM * SGTP 
APRPPARSTVSPASPLPKPRAGRCGSRPRSACSTFRPC*SIjN*M 
S*H*KRNI^QRSSSMSRRPLSCARPHR**RQGLTVAARLPTWAK 

spplacsfcqaaqksqslssgrstr*permsfrp\sppgnpaip 
slapssrp/pkgrpqctwi p srwpas ptappttt*aptss pgst 
grs mmtcptr wtatpwsarassrprwwptp * wrpsgrlstv * ra 

TGGSTATAPPKRFPRNWNPMMAE 


5787 


2 


1460 


MAS AAS VTSLADEVNCP \ ICQGTLKEAGSLSNCG/HKNFCRACL
T\RYCE1P\GPDVLEBSP\TCP\LCKEPFRP\GSFRPNW0LANV 

venibrlqlvstlglgeedvcqehgekiyffceddemqlcwcr 
eagehathtmrfledaa\apyreqihkclkciii kbreb iqeiqs 

RENKRMQVLLTQVSTKRQQVXSEFAHLRKFLEEQQSILLAQL3S 
QDGD I LRQRDEFDLLVAGE I CRFSALI EELEE KNERPARELLTD 
IRSTLIRCETRKCRKPVAVSPELGQRIRDFPQQALPLQREMKMF 
LEKLCFELDYEPAHTSLDPQTSHPKLLLSEDKQRAQFSYKWQNS 
PDNPQRFDRATCVLAHTG I TGGRHTV7WS X0LAHGGS CTVG WS 
3DVQRKGELRLRPEEGVWAVRLAKG FVS ALGS FP \TRLTLKEQP 
RQ VRVSL D YEVGWVTFTNAVTRE P I YTFTAS FTRJCVI PFFGLWG 
RGSSFSLSS 


5788 


2 


6860 


EHSVSGRSSAyGDATAEGUPAGPG^VSSSTGAISTlTGHQEGDG
SEGEGEGS TEGDVHTSNRLHMVRLPCiLERLLQTLPQLRNVGGVR 
AIPYMQVIIMLTTDLDGEDEKDKGALDNIJLSQLIAELG^KKDV 
SKKNE RS ALN E VH LWMRLLS VFMS RTKSGS KSS I CESSSL ISS 
ATAAALLSSGAVDYCLHVLKSLLEYWKSQQNDEEPVATSQLLKP 
HTTSSPPDMSPFFLRQYVKGHAADVFEAYTQLLTEMVLRLPYQI 
KKITDTNSRI PPPVFDHSWFYFIiSE YLMIQQTPFVRRQVRXLLL 
F I GGSKEKYRQLRDLHTLDS \HVRG I KKLLEEQG I FLRASWTA 
:> ^U^lHXjU I Lr± Ij x &JjMEHLKACAE IAAQRTINWQKFCIKDDSVLY 
FLI£VSFLVDEGVSPVLI£LIjSCALCGSKVUtf^^ 
SSPAPVAASSGQATTQSKSSTKKSKKEEKEKEKDGETSGSQBDQ 
LCTALVNQLNKFADKETLIQFLRCFLLESNSSSVRWQAHCLTLH 
I YRNSSKS QQEXLLDLMWS I WP E LP AYGRKAAQFVDLLGYFSLK 
TPQTEKKL KE YSQKAVE 1 LRTQNHILTNHPNSN I YNTLSGLVEF 
DGYYLES DPCLVCNNPE VPFCYI KLSS I KVDTRYTTTQQWKLI 
GSHTIS KVTVKIGDLKRTKMVRT INL YYNNRTVQA3TVELKNKPA 
RWHKAKKVQLTPGQTEVKrDLPLPIVASNLMIEFADFYENYQAS 
TETLQCPRCSASVPANPGVCX3NCGENVYQCHKCRSINYDBKDPF








LCNACGFCKYARFDFMLYAJCPCCAV^PIENEBDRKKAVSNINTL 
LDKADRVYHQIMGHRPQLENLLCKVNEAAPEKPQDDSGTAGGIS 
STS ASVNRY ILQLAQEYCGDCKNS FDBLSKI IQKVFASRKELLB 
YDLQQREAATKSS RTS VQPTFTASQYRALS VLGCGH TSS TKCYG 
CAS AVTEHC ITLLRALATNPALRHILVSQGL IRELPDYNLRRGA 
AAMREE VRQLMCLLTRDNPEATQQMNDL I IGXVSTALKGHWANP 
DLASSLQYEMLLLTDSISKEDSCWELRLRCALSLFLMAVNIKTP 
VVVE^ITLMCLRILQKLIKPPAPTSKKNKDVPVEALTrVKPYCN 
EIHAQAQLWLKRDPKAS YDAWKKCLP IRG I DGNGKAPS KS ELRH 
LYLTEKYVWRWKQFLS RRGKRTS PLDL KLGHNNWL RQVLFTPAT 
QAARQAACT I VEALAT I PS RKQQVLDLLTS YLDE L S I AGECAAJS 
YLALYQKLZTSAHWKVYLAARGVLPYVGNLITKEIARLLALEEA 
TLSTDLQQG YALKS LTGLLSS F VE VE S I KRHFKSR LVGTVLNG Y 
LCLRKLWQRTKIi I DETQDMLLEMLSDMTTGTES ETKAFMAVC I 
ETAKRYNLDDYRTP VF I FERLCS 1 1 YPEBNEVTBFFVTLEKDPQ 
QEDFLQGRMPGNPYSSWEPGIGPLMRDIKNKICQDCDLYALLED 
DSGMELLVNNKIlSIiDLPVAEVYKKVWCTTNEGEPMRIVYRMRG 
LLGDATEEFIESLDSTTDEEEDEEEVYKMAGVMAQCGGLECMLN 
RLAG I RDF KQGRHLLTVLL KliFS YC VKVKVNRQQLVKLEMNTLN 
VMLGTLNLAX VAEQESKDSGGAAVAEQVLS I MB I \ ICAEPNVEP 
LSEDKGNLLLTGDKDQLVMLLDQ INSTFVRSNPS VLQGLLRI I P 
YLSFGEVEKMQILVERFKPYCNFDXYDEDHSGDDKVFL\DCFCK 
IAAGIK\NNSNGHQL\KDL\ILQKGITQNALD\YMKKHIP/SAA 
R I WDADI \ W KS FCLRPALP FILRLLRGLA IQH PGTQVL I GTDS I 

pnlhkleqvs\sdegigtla\enl\leslrehpdvnkkidaVar 
rbtraeickrmamamrqkalgtlg \mttnekgqwd/trtallea 
dweelieepXgltccicregykfqptkvlgiytftkrwi.ggvw 
enkpretsratstvshfni vhydc \hla\avs largreewesaa 
lqnantkcngllpvwgphvpesafatclarhntylqectgqrep 
tyqlnihdi kl l flrfameqs fsadtggggrbsnihli pyi iht 
glyvlnttratsreeknlqgfleqpkekwvesafevdgpyyftv 

LALHILPPEQWRATRVE IIaRRLLVTSQARAVAPGGATRLTDKAV 

kdysayrssllfwalvdliynmfkkvptsnteggwscslaeyir 
hndmpiyeaaukalktfqeefmpvetfsefldvagllseitdpe 
sflkdllnsvp 


5789 


1 




lplhavektgrpgqpalkmpgklrsdaglesdtamkkgetlrkq 
teekekkekpksdkteeiaeeeetvfpkakqvkkkaepsevdmn 
spkskkakk\keepsqndispktkslrkkkepiekkvvssktkk 
vtkneepseeeidapkpkkwkkekemngetrekspklkngfphp 
epdcnpseaaseesnseieqeipveqkeg\afsnfpiseetikl 
LKGRGVTFLFP iqaktfhhvysgkdli aqartgtgktfsfaipl 
ieklhg\ei^drkrgrapqvlvlaptrei^ovskdfsditkkl 
svacfyggtpyggqfermrngidilvgtpgrikdhiqngkldlt 
klnhvvldbvdqmldmgfadqveeilsvaykkdsednpqtllfs 
atcphwfnvakkymksty^vdligkktqktaitvehlaijkch 
wtqraavigdvirvysghqgrti ifcbtkkeaqelsqnsaikqd 
aqslhgdipqkoreitlkgfrngsfgvlvatnvaargldipevd 
lviqssppkdvesyihrsgrtgragrtgvcicfyqhkeeyqlvq 
veqkagikfkrigvpsateiikasskdairlldsvpptaishfk 
qsaeklieekgaveaiaaalahisgatsvdqrslinsnvgfvtm 
ilqcsiempnisyawkelkeqlgeeidskvkgmvflkgklgvcf 

wvri/ov i e» AynAWriUorCKWyijo VAl tyrtLiEGPREGYGGFRGQ 

REGSRGFRGQRDGNRRFRGQREGSRGPRGQRSGGGNKSNRSQNK 
GQKRSFSKAFGQ 


5790 


3786 


1505 


ARRQRDPLQALRkRNQELKQQVDSLLS^SOtKEALEPNKRQHIY
QRCIQLKQAI DENKNALQKLS KADBSAP VAN YNQRKEEEHTLLD 
KLTQQLQGLAVTISRENIXEVGAPTEEEBESESEDSEDSGGBEE 
DAEEE SEE KEENES HKWS TGEE Y I AVGD FTAQQ VGD LTFKKGE I 
LLVI EKKPDGW WI AKDAXGNEGL VP RTYLE P YS EEEEGQES SEE 
GS EEDVEAVI)ETADGAEVK\QRTDPHWSAVQKAI SEAG I FCLVN 
HVSFCYLI VLMRWRMETVEDTNGSETGFRAWNVOSRGR IFLVSK 








P VLQQ I NT VD VLTTMGAIPAG FR PSTLSQLLEEGNQ FRAN Y FLQ 
PEI^PSQLAl-'RDLMWDATEGTtRSRPSRlSLILTLWSCKMIPLP 
QMS IQVLSRHVRI*CLPDGNKVLSNTHTVRATWQP KKP KTWTFS P 
QVTRILPCLLDGDCFIRSNSASPDLGILFBLGISYIRNSTGBRG 
ELSCGKVFLKLFDASGVPIPAKTyELFLNGGTPYEKGIEVDPSI 
SRRAHGS VF YQIMTMRRQPQLI*VKLRS LNRRSRNVLS LLPETL I 
GNMCSIHLUFYRQILGDVLLKDRMSLQSTDLISHPMtiATFPML 
LEQ PDVMDALRSS WAGQES \ TLKRS EKR \ PK2 FLKVPR FLLVYH 
\GCVLPLL/HTPTRLPPFRWAEEETETARWKVITDFLKQNQENQ 
GALQALLS PDGVHEPFDLSEQTYD FLG EMRKNAV 


5791


3 


163* 


LrVAE FAGTS R/ 1 GAGLI QPLHRAPARDHGL IiRGGAAPALS VSH 
GN/GKQL/AMSSQGSDDEQIKRENIRSLTM3GHVGFESLPDQLV 
NRSIQQGFCFNILCVGETGIGKSTIiIDTLFNTNFEDYESSHFCP 
NVKLKAQTYELQESNVQLKLTIVNTVGFGDQ INKERS YQ? I VDY 
IDAQFEAYLQEELKIKRSIiFT YHDSRIHVCLYFI S PTGHSLKTL 
DLLTMKNLDSKVTYIIPVIAKADTVSKTELQKFKIKLMSELVSNG 
VQ I YQFPTDDDTIAKVNAAMNGQLPFAVVGSMDEVKVGN KMVKA 
RQYPWGWQVENENHCDFVKLREMLI CTNMEDLREQTHTRHYEL 
YRROCLEEMGFrDVGPENKPVSVQETYEAECRHEFHGERQRKEEB 
MKQMFVQRVKE KEAILKEAERBLQAKFEHLKRLHQEERMKLEE K 
RRLLEEEIIAFSKKKATSEIFHSQSFIiATGSNLRECDKDRKNSOP 
FVKQKVPEHRRSSSQAWFI KKKLEVCFDFAVICFITS I FGEQPQ 
LLI FMEKYFQVQG3YISQS B 


5792 


2263 


653 


AAAAPSPAWWCGVFVVYVVHTCWVMYGIVYTRPCSGDAsCtQPY 
larrpkxqi»\rhs FTTTRSHLGAENNI DLVLNVEDFDVESKFBR 
TVNVS VPKKTRNNGTL YAY I FLHHAGVLPWRDGKQ VHLVS PLTT 
YMVPKPEEINIiLTGESDTQQIEADKKPTSALDEPVSHKRPRLAli 
NVMADNFVPDGSSLPADVHRYKKMIQLGKTVHYLPILFIDQLSN 
RVKDLMVINRSTTELPLTVSYTKVSLGRLRFWIHMQnAVYSLQQ 
FGFSEKDADEVKG I FTOTNL YFLAl/TF F VAAFHLL FD FLAFKMD 
ISFWKKKKSMIGMSTKAVLWRCFSTVVIFLFLLDEQTSLLVLVP 
AGVGAAI ELWJCVKKAL KMT1 FWRGLM PEFQFGT YS ESBRKTEEY 
DTO^U^ KYXSYLL YPLCVGGAVYSLLNI KYKS WYS WLXNS FVNG V 
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 
I ITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGES YE 
EKATRAPHTD 


5793 


2263 


653 


AAAAPSPAWWCGVFWYWHTCWVMYGIVYTRPCSGDASCIQPY
LARRPKLQL\RHSFTTTRSHLGAENNIDLVLNVEDFDVESKFER 
TVNVS VP KKTRNNGTLYAY I FLHHAGVLP WHDGKQVHLVS PLTT 
YMVPKPEE INLLTGESDTQQ IEADKKPTS ALDEP VS KWRPRLAL 
NVMADNFVPDGSSLPADVHRYMKMIQLGKTVHYLPILFIDQLSN 
R VKDLMVINRSTTELPLTVS YDKVSLGRLRFW I HMQDAVYS LQQ 
FGFSEKDADEVKG I FVDTNLYFLALTFFVAAFHLLFD FLAFKND 
ISFWKKKKSMIOMSTKAVLWRCFSTWI FLFLLDEQTSLLVLVP 
AGVGAAI ELWKVKKALKMTIFWRGLM PE FQFGTYSES ERKTEEY 
DTQAMKYLS YLIi YPIiCVGGAVYS LLNIKYKS WYSPfL INSF VNGV 
YAFGFLFMLPQLFVNYKLKSVAHLPWXAFTYKAFNTFIDDVTAF 
1 1 TMPTSHRLACFRD DWFLVYLYQRWLYP VDKRRVNEFGES YE 
EKATRAPHTD 


5794 


1 


5016 



MGPRLSWfljbLLPAALLLHEEHSRAAAKGGCAGSGCGKCDCHGV

KGQKGBRGLPGLQGyiGFPGMQGPBGPQGPPGQKGDTGBPGLPG 
TKGTRG PPGASGYPGNPCSLPcr ^GanGPPRPP^ST D^rwrTVPCD 

GPLGPPGLPGFAGNPGPPGLPGMKGDPGBILGHVPGMLLKGERG 
FPGIPGTPGPPGLPGLQGPVGPPGFTGPPGPPGPPGPPGEKGQM 
GLSFQGPKGDKGDQGVSGPPGVPGQAQVQEKGDFATKGEKGQKG 
EPGFQGMPGVGEKGEPGKPGPRGKPGKDGDKGEKGSPGFPGEPG 
YP GLIGRQG P \QGEKGEAGP PGP PG I VI GTGPLGEKGERG YPGT 
PGPRGEPGPKGFPGLPGQPGPPGLPVPGQAGAPGFPGERGBKGD 
RGFPGTS LPGPSGRDGLPGPPGS PGPPGQPGYTNG I VECQ PGP P 
3DQGPPG I PGQPG PI GE IGEKGQ KGBSCL I CD IDGYRGPPGPQG 
PPGE IGFPGQPGAKGDRGLPGRDGVAGVPGPQGTPGL IGQPGAK 


5795






GKPGfcg^PDt^LKGUKt;DPGFPGQPGMPGRAGSPGRDGHPGt,PG"- 

PKGSPGSVGLKGERGPPGGVGPPGSRGDTOPPGPPGYGPAGPIG 

DKGQAGPPGGPGSPGIiPGPKGBPGKIVPLPGPPGABGLPGSPGF 

PGPQGDRGPPGTPGR\PGL\PGEKGAVG\QPGIGFPGPPGPK5V 

DGLPGDMGP PGTPGRPG FNGLPGNPGVQGQKGE PGVGLPGUCGL 

PGLPGIPGTPGEKGSIGVPGVPGEHGAIGPPGLQGIRGEPGPPG 

LPGS VG SPG VPG I GP PGARGP PGGQGPPGLSGPPG I KGE KGFPG 

FPGI,DMPGPKGDKGAQGLPGITGQSGLPGLPGQQGAPGIPGFPG 

SKGBMGVMGTPGQPGSPGPWGAPGLPGEKGD\HGFPGSSGPRGD 

P3LKGDKGDVGLPGKPGSMDKVYMGSMKGQKGDQGEKGQIGPIG 

EKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLP 

GPKG SVGGMGLPGTPGBKG VPGI PGPQGS PGLPGD KGAKG E KGQ 

AGPPGIGIPGLRGBKGDQGrAGFPGSPGEKGBKGSIGIPGMPGS 

PGLKGSPGSVG YPGSPGLPGEKGDKGLPGLDGI PGVKGEAGLPG 

TPGPTGPAGQKGEPGSDG I PGSAGEKGEPGLPGRG FPGFPGAKG 

D KG S KGE VG FPGLAGSPG I PGSKGEQGFMGP PGPQGQPGL PGS P 

GHATEG P KGDRGPQGQPGL PGLPGPMG? PGLPG I DGVKGDKGNP 

GWPGAPGVPGPKGDPGFQGMPGIGGSPGITGSKGDKGPPGVPGF 

QGPKGLPGU3G IKGDQGDQGVPGAKGLPGPPGPPGPYDI IKGEP 

GLPGPEGPPGLKGLQGLPGPKGQQGVTGLVGIPGPPGIPGFDGA 

PGQKGBMGPAGPTGPRGFPGPPGPDGLPGSMGPPGTPSVEHGFL 

VTRHSQTIDDPQCPSGTKILYHGY3LLYVQGNERAHGQDLGTAG 

SCLRKFSTMPFLFCNINNVCNFASRNDYSYWLSTPEPMPMSMAP 

ITGFNIRPFISRCAVCEAPAMVMAVHSQTIQIPPCPSGWSSLWr 

GYSFVMHTSAGAEGSGQAIASPGSCLEEFRSAPFIECHGRGTCN 

YYANAYSFWLATIBRSEMFKKPTPSTLKAGELRTHVSRCQVCMR 

RT j 




1192 


61 


STRSPTVEYISAHPHrLFMLLKGYEAPQIALRCXSIMLRECIRHE 
PLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVL 
VADFLEQNYDTI FEDYEKLLQSSNYVTKRQSLKLLGELILDRHN 
FAIMTKYXSKPENLKI^IMNLI^DKSPNIQFEAFHVFKVFVASPH 
KTQ P I VE 1 LLKNQPKL I BFLS S FQKERTDDEQFADEKWYLI KQI 
RDLKKTAP+RALRDSKR 


5796


2 


1078 


GRVGWE^WCMYISPPKDVWDAGDPSLPIRTPAMIGCSFVVNRKF 
FGEIGLIiDPGMDVYGGBNIELG IKVWLOGGSMEVLPCSRVAHI E 
RKKJCPYNSWIGFYTKRNALRVAEVWMDDYKSHVYIAWNLPLENP 
G1DIGDVSERRALRKSLKCKNFQWYLDHVYPEMRRYNNTVAYGE 
LRNNKAKDVCLDQG PLEMHTA ILYP CHGWG PQLAR YTREG FLHL 
GALGTTT1»I»PDTRCLVDNS KSRLPQLLDCDKVXSSL YKRWNF1Q 
NGAIMNKGTGRCLEVENRGLAGIDLIIjRSCTGQRWTIKN3IK*R 

EGAGALEPGPQDMAAPPNIWTSCPGGETARGRQVLDGPPRASPG 
QHRDPG 


5797 


2 


891 


PRVRQKTLVDVTLENSNIKDQIRNLQQTYEASMDKLkEKORQtiE
VAQVENQLLKMKVESSQEANAEVMREMTKKLYSQyEEKLQEEQR 
KHSAEKEALLEETNS FLKAIEEANKKMQAAEISLEEKDQRIGEL 
DRL XERMB KERHQLQLQtiLEHETEMSG EIjTDSDKERYQQLEEAS 
AS LRERIRHLNDMVHCQQKKVKQMVB3 IES LKKKLQQKQLL ILQ 
LLEKI S FLEGENNELQSRLDYLTETQAKTBVETREIGVGCDLLP 
SQTGRTREIVMPSRNYTPYTRVLELTMKKTLT 


5798
5799


644 


115 


KILGSRWKSNSNQEKQPYYEEQARLSKIHLBKYPNYKYKPRPKlT" 
TCIVDGKKLRIG3YKQLMRSRRQEMRQFFTVGQQPQIP1TTGTG 
WYPGAI TMATTTPS PQMTS DCSSTSAS PE PSLP VIQST YGMKT 
DGGSLAGNEMINGEDEMEMYDDYEDDPKSDYS5ENEAPEAVSAN 




2^79 


1435 



LLSTY I KFINLFPETKAT I QGVLRAGSQLRNAD VELQQRAVB YL
TLSSVASTDVIATVLEEMPPFPERESSILAKLKRKKGPGAGSAL 
DOjRRDPSSNDINGGMEPTPSTVSTPSPSADLLGIiRAAPPPAAP 
PASAGAGMiLVDVFDGPAAQPSLGPTPEEAFLSPGPEDIGPP IP 
EADBLI^KFVCXNNGVLFENQLLQIGVKSEFRQNLGRMYLFYGN 
KTS VQFQMFS PTVVHPGDLQTQI*AVQTKR VAAQVDGGAQ VQQVI, 
1 1 BCLRD FLTPP LLSVRFR YGG APQALTLKL PVTI NKFFQPTEM 
\AQDFFQRWKQLSLPQQEAQKIFKANHPMDAEVTKAKLLGFGSA 








LLDNVD PNP ENFVGAG 1 1 QTKALQVGCLLRLE PNAQAQM YRLTL
RTSKBFVSRHLCELLAQQF 


5800 


2673 


1435 


LLS T Y I KP INL PPBTKATI QGVLRAGSQLRNAD VELQQRAVE YL 
TLS S V7ASTDVLATVLESMPP FPERESS I LAKIiKRKKG PGAGS AL 
DDGRRDPSSNDINGGM5PTPSTVSTPSPSADLLGLRAAPPPAAP 
PASAGAGNLLVDVFDGPAAQPSLGPTPBEAFLSPGPEDIGPPIP 
EADELLNKFVCKNNGVLFENQLLQIGVKSEFRQNLG3RMY1»FYGN 
KTS VQFQNFS PTVVH PGDLQTQLAVQTKR VAAQVDGGAQVQQVIi 
NIBCLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM 
AAQDFFQRWKQLSLPQQBAQKI FKANH PMDAE VTKAKLLGFGSA 
LLDNVDPNPENFVGAG I I QTKALQVGCLLRLE PNAQAQMYRLTL 
RTS KEPVS RHLCELLAQQF 


5801 




1413 


FPRLYHLI PDGE ITS xKINRVDPSESLS I RLVGGdBT PLVHI 1 1 
QHI YRDG V IARDGR L LPGD 1 1 LKVNGMD ISNVPHN YAVRLLRQP 
CQVLWLTVMREQKFRSRNNGQAPDAYRPRDDS FHVILNKS S PEE 
QLGTKLVRKVDEPGVFI FNVLDGGVAYRHGQLEENDRVLAINGH 
DLRYGS PBS AAHL I QAS B RRVHLWSRQVRQR SPDI FQEAGWNS 
NGSWSPGPGERSNTPKPLHPT1TCHEKVVMIQKDPGESLGMTVA 
GGASHRBWDLP IYVISVE PGG VI SRDGRIKTGDI LLNVDGVELT 
BVSRSEAVALLKRTSSS I VL KALEVKEYE PQEDCSS P AALDSNH 
NMAPPSDWSPSWVMWLBLPRCLYWCKDrVLRRNTAGSLGFCIVG 
GYEEYNGNKPFPIKSIVEGTPAYNDGRIRCGDlXLAVNGRSTSG 
M IHACLARLLKELKGRI TLTI VSWPGTFL 


5802 


3 


230 


CFS LYQ I MERI MDLPTLLRHAFREM FSVGGLFWMFRI RI I LCLM 
GAFFYLI S PLD FVP EALFGI LGFLDDFFVI FLLL I Y I S I M YRE V 
ITQRLTR 


5803 


2234 


1299 


BAQFGTTAEIYAYREEQDFGlBIVKVKAIGRQRFICVLELRTQSD 
GIQQAKVQILP2CVL PSTMSAVQLESLNKCQI FPSKP VSREDQC 
S YKWWQKYQKRKFHCANLTSWPRWLYSLYDAETLMDRI KKQLRE 
W DENLKDDS L P SNP I DFS YRVAACLPIDDVLR IQLLKI GS AIQR 
LRCB LD I MNKCT3 LC CKQCQETE I TTKNB IPS LS LCG PMAAYVN 
PHGYVHE^LTVYKACNLNLIGRPSTEHSWFPGYAWTVAQCKICA 
SHIGWKFTATKXDXS PQKFWGLTRSALIiPTI PDTEDE IS PDKVI 
LCL


5804


2 


1707 


EMEKQRQEEQRKRTEEERKRRIEQDMLEiOUaQRELAKRAEQIE 
D INNTGTES ASEEGDDS LLITWP VKS YXTSGKMKKNFEDLE KE 
REBKERIKYEEDKRIRYEBQRPSLKEAKCLSLVMDDEIESBAKK 
ESLSPGKXKLTFEELERQRQENRKKQAEEEARKRLEEEKRAFEE 
ARRCMVNEDEEMQDTAKI FKGYRPGKLKLSFEEMERQRREDEKR 
l^EEFiARRRI EEB KKAFAEARRNMVVDDDS PEMYKTISQEFLTP 
GKLEINFEELLKQKMEEEKRRTBEERKHKLEMEKQBFEQLRQEM 
GEE E EE N ET FGLS RB YE ELI KLKRSGS I QAKNLKS KFE KIGQLS 
EKEIQKKIEEERARRRAIDLEIKEREAEIJFHEEDDVDVRPARKS 
EAP FTHKVNMKAR FE QMAKAREEEEQRRIBEQKLLRMQFEQREI 
DAALQKKREEEEEEEGS IMNGSTAEDEEQTRSGAPWFKKPLKNT 
SWDSE PVRFTVKVTGEPKPBITWWFEGEILQDGEDYQYIERGB 
TYCLY L PETF P E DGG E YM CKA VNNKGS AAS TC I LT I ES KN 


5805 


3 


776 


Y 3SDTLGQVYKS KIRW W I EENGGNGN IS VDDL I ALLDLJVEHAS S 
AFKESQQQSBDREYE VKERLYPKS KRRYDTYNIAGYQGEIEVGL 
YT I Q I LQLI P FFDNKNELS KRYMVUF VSGS SDI PGDPNNB YKLA 
ajivw XxriUi rjjKb i>h K.KS r DeFDa X F VI*uKPRNNI KQNEEAKTR 
RIWAGYFKKYVDIFCLLEESQNNTGLG3KFSEPLQVERCRRNLV 
ALKADKFSGLLEYLIKSQEDAISTMKCIVNEYTFLIiK 


5806 


1257 


877 


AVFTFHNHGRTANLYSLHSWLGITTVFLFACQRFLGFAVFLLPW 
ASMWLRSLDKPIHVFFGAAILSLSIASVISGINEKLFFSLKNTT 
RPYHS L PS EAVFANSTGMLWAFGLLVLYXLLAS S WKRP 


5807 


2257 


1302 


RFS KKT FRRPMAVDIQP ACLGLYCGKTLLFKNGSTE I YGECGVC 
PRGQRTNAQKYCQPCTESPELYDWLYLGFMAMLPLVLHWFPIEW 
YSGKKSSSALFQHITALFECSMAAI I TLLVSDPVGVLYIRS CRV 
LMLSDWYTML YNPS P0YVTTVHCTHEAVYPLYTI VF I YYAFCLV








LKMLLRP LLVKKIACGLGKSJDRFKS I YAALYFFPILTVLQAVGG 
GLLYYAFPYI I LVLS LVTLAVYMS ASB I EKCYDLLVR KKRLI VL 

FSHWLLHAYGIISISRVDKLEQDLPLLALVPTPALFyLFTAKFT 
EPSRILSEGANGH 


5808 


2 


433 


SLPDSGVVBYIiSNGGVADNHKDFGEIiRYNECI»KNFSCNGkNGSS 

EGRITHGFOLKSAYEI^LMPYTNYTFDPKGVIDYIFYSKTHMNV 

LGVLGPLDPQWLVENWITGCPHPHIPSDHFSuLTQLELHPPLLP 
LVNGVHLPNRR 


5809 


464 


2422 


ILVPGFQG I IjHPGVYCAIjQSQHQAQELVADIDECE VSGljCRKGG 
RCVNTHGS FECYCMDGYLPRNG PEPFHPTTDATSCTE IDCGTPP 
EVPDGYIIGNYTSSWSQVRYACREGFFSVPEDTVSSCTGLGTW 
ESPKLHCQEINCGNPPEMRHAILVGNHSSRLGGVARYVCQBGFE 
SPGGKITSVCTEKGTWRESTLTCTEILTKINDVSLFNI>TCVRWQ 
INSRRINPKISYVISIKGQRLDPMESVREETVMLTTDSRTPEVC 
LALYPGTNYTVNISTAPPRRSMPAVIGFQTAEVDLLEDDGSFNI 
SIFNETCLKLNRRSRKVGSEHMYQFTVLGQRWYLANFSHATSFN 
FTTREQVPWCLDLYPTTDYTVNVTLLRS P KRHS VQ IT IATP PA 
VKQnSNISGFNETCLRWRSIKTADMEEMYLFHIWGQRWYQKEF 
AQEMTFNISSSSRDPEVCLDLRPGTNYNVSLRAIiSSBLPWISI, 
TTQITEPPLPEVEFFTVHRGPLPRLRLRKAKEKNGPISSYQVLV 
LPLALQSTFSCDSEGASSFFSNASDADGYVAAELLAKDVPDDAM 
BIPIGDRLYYGEYYNAPLKRGSDYCIILRITSEWNKVRRHSCAV 
WAQVKDSSLMLLQMAGVGLGSLAWIILTFLSFSAV 


5810 


3 


1641 


KVFGTHKDHEVSTLDTAI S AVKVQLAEFLENLQEKSfcRI EAFVS 
B I ES FFNT 1 EE^CS KNEKRU3EQNEEMMK KVLAQ YDE KAQSFEE 
VKKKKMEFLHEQMVHFLQSMDTAKDTL^TIVREAEELDEAVTXT 
S FEEINERIiLSAMESTAStiEKMPAAFSLFEHYDDSSARSDQMLK 
QVAVPQP PRLEPQEPNS ATSTTIAVY WS MNKEDVTDS FQVYCME 
E PQDDQEVNELVEEYRJbTVKES YCI FEDLEPDRCYQVWVMAVNF 
TGCSLPSERAIFRTAPSTPVIRAEDCTVCWNTATIRWRPTTPEA 
TETYTLEYCRQHSPEGEGIiRS FSGI KGLQLKVNLQPNDNYFFYV 
RAIMAFGTSEQSEAALISTRGTRFLLLRETAHPAUIISSSGTV1 
SFGERRRLTEIPSVLGBELPSCGQHYWETTVTDCPAYRLGICSS 
SAVQAGALGQGE TSW YMHCS E PQRYTFFYSG I VS D VHVTER PAR 
VGILLD YNNQRL I FINAES EQLLFI I RHRFNEG VH PAFALE KPG 
KCTLHLGIBP PDSVRHK 


5811 


1318 


851 


AAA1JU5PLPEDKWSABKRRPLKSSLGYEITFSLLNPDPKSHDVY 
WDIEGAVRRYVQPFLNALGAAGNFS VDSQ I LYYAMLG VNPRFDS 
ASSSYYLDMHSLPHVINPVESRI^SSAASLYPViaJFIiLYVPELA 
HSPLYIQDKDGAPVATNAFHSPRWaGIMVYNVDSKTYNASVLPV 
RVEVDMVRVMEVFLAQLRLLFGIAQPQLPPKCLLSGPTSEGLMT 
WELDR LLWARS VENLATATTTLTS LAQL LGKI SN I VI KDDVA3 E 
VYKAVAAVQKSAEBLASGHLASAFVASQEAVTSSEIAFFDPSLL 
HLLY FP DDQ K FAI YI PLF LPMA VP I LLS LVKI FLETRKS WRKPE 
KTD 


5812 


5204 


2744 


GGRQRCQRGRSCGAREBEVEPGTARPPPAASAMDASLEXIADPT
LARMGKNLKEAVKKLEDSQRRTEEENGKKLdSGDI PGPLQGSGQ 
DMVS ILQIjVQNLMHGDBDEEPQS PRI QNIGBCGHMALLGHSIjGA 
YISTLDKEKLRKLTTRILSDTrLWLCRIFRYENGCAYFHKEERE 
GLAKI CRLAIHS RYEDFWDGFNVLYNKKPVTYLSAAARPGLGQ 

YLCNQLGLPFPCLCRVPCNTVFGSQHQMDVAFLEKLIKDDIERG 
RLPLLLVANAGTAAVBHTDKTGRT.ITPT PcnvnTut mrcnmrr * m 

LALG YVS S SVLAAAKCDS^fTMTPGP WLGLPA VPAVTL YKEDDPA 
LTLVAGLTSNKPTDKLRALPLWLSLQYLGLDGFVERIKHACQIiS 
QRLQESLKKVNYIKILVEDELSSPVVVFRFFQELPGSDPVFKAV 
PVPKMTPSGVGRERHS CDALNR WLG EQL KQLVPASGLTVMDLEA 
EGTCLRFSPIiMTAAVLGTRGEDVDQLVAC I ES KLP VLCCTLQLR 
EEFKQEVEATAGLLYVDDPNWSGIGWRYEHANDDKSSLKSYPQ 
3 EUT HAGLLKKLNE LE SDLT F KI GP EYKS MKSCL YVGMASDNVH 
AAELVETIAATARE I EDNSRLLENN5TEWRKGIQEAQVELQKAS 
BERLLEEGVLRQI P VVGS VLNWFSPVQALQKGRTFNLTAGSI.ES 








. TEPIYVYKAQGAGVTLPPTPSGSRTKQRLPGQKPFKRSLRG§DA^ 
LSrrSSVSHIBDLEKVERLSSGPEQITLEASSTEGHPGAPSPQH 
TDQTBAFQKGVPH PEDDHSQ VEGPESLR 


5813 


2936 


699 


HRDGVSGSLERPLTDRSRTGAFAQQRGKMATAGGGSGADPGSRG 
Ll^LLSFCVLLAGLCRGNSVERKIYIPLNKTAPCVRLLNATKOI 
GCQS S I S GDTGVIHWE KE EDLQWVLTDGPNP PYMVLLES KHFT 
RDLKEKLKGRTS RI AGLAVSLT KPS PASGFS PSVQCPNDGFGVY 
SNSYGPEPAHCREIQWNSLGNGLAYEDFSFPIFLLEDENETKVI 
KQCYQDHNI^QNGSAPTFPLCAMQLFSHMAWLSFSTAT\CMRRS 
SIQSTFSINPKIVO>PLSDYNVWSMLKPIfclTTGTLKPDDR\AA/A 

atrldsrsffwnv\apgaesavasfvtqlaaaealqkapdvttl 
prnvmfvffqgetfd yigssrmvydmekgkfpvqlenvds fveh 
gqyalrtslblwmhtdpvsqknesvrnqvbdllatleksgagvp 
avilrrpnqsoplppsslqrflrarnisgwladhsgafhnkyy 
qs i ydtaewinvs ype wle plke / et wn fg * qdtakaladvatv 

LGRAIjYELAGGTNFSDTVQADPQTVTRLLYGXFLIKANNSWFQS 

ilqgrdlrsylg*rglfqh\yiav\ssi>tntiyv/vlqyalanl 

TGTWNLTREQCQDPSKVPSENKDLYEYSWVQGPLHSNETDRLP 

rcvrstarlaralspafelsqwssteystwtesrwkdirarifl 
iaskelelitltvgfgilifslivtycinakadvlfiaprepga 

VSY 


5814 


8500 


432 



ALKCRPRRVLAILVGPVQPDRMAEEGAVAVCVRVRPLNSREBSL 
GETAQVYWKTHNNVI YPVDGS KSFNFDRVLHGNETPKNVYEA\ I 

aapitdsaiqgyngtifa\ygqt\asgxtytmmgsbdhlgvipq 

GQ FHGHFSQKI * EVFLDREFLLRVS YMEI YNBTITDLLCGTQKM 
KPLI I RSDVNRNV YVADLTEE WYTS E MALKWI TKGEKS RH YGE 
TKMNQRS SRS HT I FRM I LES REKGEP SNCEGS VKVSHLNLVDLA 
GSERAAQTGAAGVRLKEGCWINRSLFIljGQVIKKLSDGQVGGFI 
NYRDSKLTRILQNSLGGNPKTRIICTITPVSFDETLTALQFAST 
AK YMKNT P YVNEVS TDEALL KR YRKE IMDLKKQLEEVSLETRAQ 
AMEKDQLAQIiLEE FCDLLQKVQNEKI ENLTRML VTS SSLTLQQ3L 
KAKRKRR VTWCLGKINKMKNSNYADQFN I PTNI TTKTHKXS IML 
LREI DE SVCSESDVFSNTLDTLS EI EWNPATKLLNQEN I ES ELN 
SLPJuOYDNLVLDYEQLRTEKEEMELKLKBKNDLDEFEALERKTfC 
KDQEMQLIHEIStniKNLVKHREVYNQDLBNELSSKVELLREKED 
QIKKLQEYIDSQKLENIKMDLSYSLESIEDPKQMKQTLFDAETV 
ALDAKRESAFLRS ENLELKEKMKELATTYKQMBNDI QLYQSQLE 
AKKKMQVDtiEKE LQS AFNE ITKtiTSL I DGKVPKDLLCNLELEGK 
ITDLQKELNKKVBENEALREE VILLSELKSLPSEVERLRKE IQD 
KSEELHI ITSEKDKLFSBWHKESRVQGLLEEIGKTKDDLATTQ 
SNYKSTDQEFQNFKTLHMDFEQKYKMVLEENERMNQEIVNLSKE 
AQKFDSSLGALKTELSYKTQELQEKTREVQERLNEMEQLKEQLE 
NRDSPLQTVEREKTLITEKLQQTLEEVKTLTQEKDDLKQLQESL 
QIBRDQLKSDIHDTVNMNIDTQEQLRNALESLKQHQETINTLKS 
KI S EE VSRNLHMEENTG ETKDEFQQKMVGI DKKQD L EAKNTQTL 

TftDVKDNEIIBQQRKIFSLlQEKWELCXiMLESVIAEKEQLKTDL 
KBNIEMTIENQEELRLLGDELKKQQEIVAQEKNHAIKKEGELSR 
TCDRIAEVEEKLKBKSQQLQEKQQQLLNVQEEMSEMQKKINBIE 
NLKNELKNKELTLEHMETERLEIAQKLNENYEEVKS ITXERKVL 
KELQKSPETERDHLRGYIREIEATGLQTKEELKIAHIHLKEHQE 
TIDELRRSVSEKTAQIINTQDLEKSHTKLQEEIPVLHEEQELLP 
NVKKVSBTQETMNELELLrEQSTTKDSTTLAR IEMERLRLNEKF 
QESQEEIKSLTfOSRDNLKTIKEALEVKKDQLKEHIRETLAKIQE 
SQSKQEQSLNMKEKDNETTKI VSEMEQFKPKDSALLR IEI EMLG 
LSKRLQESHDEMKSVAKBKDDLQRLQBVLQS BSDQLKENI KE I V 
AKHLETE EELKVAHCCLKEQEB T INELR VNLS EKETE 1ST IQKQ 
LEAINDKLQNKIQEIYEKEEQLNIKQISEVQEKVNELKQFKEKR 
KAKDSALQSIESKMLBLTNRLQESQEEIQIMIKEKEEMKRVQEA 
LQIERDQLKENTKEIVAKMKESQEKEYQFLKMTAVNETQEKMCE 
IEHLKEQFETQ KLNL EN I ETEN IRLTQ ILHENLEEMRS VTKERD 
DIASVEBTLKVEPJX)LKENLRETITRDLEKQEELKIVHMHIJGEH 








QEriDlCLRGIVSEKTMBISNMQKDLEHSNDALKAQDLk!! 1 ^^!^ 
I AHMHLKEQQETI DKiiHG I VS E KTDKLSNMQKDLENSNAKLCjE X 
IQELKANEHQLITLKKDVKETQKKVSEMEQLKKQIKDQSLTLSK 
LEI ENLNLAQ KLH ENL3EMKS VM KERDNLRRVEETLKLERDQLK 
ESLQETKARDLEIQQELKTARMI^KEEKETVDKLREKtSEKTIQ 
ISDIQKDLDKSKDELQKKIQBLQKKELQLLRVKEDVNMSHKKIN 
EMEQLKKQF EPNYLCKCEMDNFQLTKKLHE SI»EEI R I VAKERDE 
IiRR IKE S LKMERDQFI ATLREM I ARDRQNHQ VKPE KRLLSDGQQ 
HLMESLREKCSRIKELIiKRYSEMDDHVECLNRLSLOLEKBIEFH 
R I MKKLKYVLS YVTKI KEEQH E C INKFEMDF I DEVB KQKELL I K 
IQHLQQDCDVPSRELRDLKLNQNMDLH IEE ILKDFSESEFPSIK 
TEFQQVLSNRKEMTQPLEEHLNTRFDIEKLKNG1QKENDRICQV 
NNFFNNRIIAIMNESTEFEERSATISKEWEQDLKSLKEKNEKLF 
KNYQTLKTSIiASGAQVNPTTQDNKNPHVTSRATQLTTEKIRELB 
NSLHEAKESAMHKESKI IKMQKELEVTNDI IAKLQAKVHESNKC 
LEKTKETIQVLQDKVALGAKPYKEBIEDLKMKLGKIDLBKMKNA 
KEFEKEISATKATVBYQKEVIRLLRENLRRSQQAQDTSVrSEHT 
DPQPSNKPLTCGGGSGIVQNTKALILKSEHIRLEKEISKLKQQN 
BQhJ KQKNELLSNNQHLSNE VKTWKERTL KREAHRQVTCENSP K 
SPKVTGTASKKKQITPSQCKERNLQDPVPKESPKSCFFDSRSKS 
LPSPHP VRYFBNS SLGLCPEVQNAGAESVDSQP \GPWARLFQGK 
DVP\ECKTQ 


5815


23 


1460 


SELVMWTVQNRES IiGLLSF PVM ITMVCCAHSTNEPSNMS Y VKET 
VDRLLKGYD IRIiRPDFGG P PVDVGMRI DVAS IDMVSEVNMD YTL 
TMYFQQSWi03KRLSYSGIPI^TLDNRVADQLWVPDTYFLNDKK 
S FVHGVTVKNRMI RLHPDGTVL YGLR I TTTAACMMDLRRYP LDE 
CNC7LEIES YGYTTDDI EFYWNGGEGAVTGVNKIELPQFS I VDY 
KMVSKKVBFTTGAYPRLSLSFRLKRNIGYFILQTYMPSTIiITIL 
SWV S FWINYDASAARVALG I TTVLTMTTISTHLRETIiPKI P YVK 
AIDIYLMGCFVBVFLALLEYAFVNYIFFGKGPQKKGASKQDQSA 
NEKNKLEMNKVQVDAHGNILLSTLE I RNETSGSE VLTSVSDPKA 
TMYSYDSAS IQYRKPLSSRE\A*GRAPDRHG VPSKGR IRRRAS\ 
QLKVKI PDLTDVNS IDKWS RM FFP I T FSLFNWYWIj YYVH 


5816 


861 


191 


TSSRSRAAAQEGDAETPGSVERRGRRAGABDGMSQAPGAQPSPP 
TVYHERQRLELCAVHALNNVLQQQLFSQEAADEICKRLAPDSRIi 
NPHRSLliGTGNYD VNVIMAAIiQGI/3LAAVWWDRRR PLSQLALPQ 
VLGLILNLPSPVSLGLLSLPLRRRHIjRWPCARL/VTVSYYNLDS 
K\LRAPEGPGGLRTE\*G PFLAAALAQGL CEVLL WTKE VEE KG 
SWLRTD 


5817 


851 


118 


KLKKGPGANRGRSCRGCSGGREPSGGALPKRHCPC*PPSPPAAD 
VMSNTTVPNAPQANSDSMVp YVLGPFFLI TLVGVVVAVVMYVQK 
KKRVDRLRHH LLPMYS YDPAEELHEAEQE LLSDMGDP KW\ QAG 
RVATSTSGCHCWMSRRDLTPLPHPSBPGVLDCLGPCHLIiPLLSP 
GSPCWVLGLHFSLHPPSAASASHALTITSLPPGLLPFVGVBLrA 
HPQALMGRGFPSGMAAAGRHLCFL 


5818 


3 


3918 


QALR DKL W IFLVQS FYAVRHTES WKLMSTDDQQKI QAAAFDKGD
DRRLGKKP I FS5 S QQRKQ VSDSGDI KI KS WRGNNKKE CWS YLST 
NKKMKSDGLGASGHSSSTNRNSINKTLKQDDVKEKDGTKIASKI 
TKELKTGGXNVS GKPKTVTKS KTENGDKARLENMS PRQWERSA 
TAAAAATGQKNLLNGKGVRNQEGQISGARPKVLTGNLNVQAKAK 
PIiKKATGKDSPCLSIAGPSSRSTDSSMEFSISTECLDEPKENGS 

NGTSNK KS I HEQDTNVNNS VLKKVSGKGCS EPVP QAILKKRGTS 
HGCTAAQQRTKSTP SNLTKTQGS QG ES PNS VKSS VSSRQSDENV 
AKLDHNTTTEKQAP KRKMVKQVHTALPKVNAKI VAMPKNLNQSK 
KGETLNNKDSKQKMPPGQVISKTQPSSQRPLKHETSTVQKSMFH 
DVRDNKNKDSVSEQKPHKPL1NLASEISDAEAIiQSSCRP\DPQK 
PLNDQEKEKLALECQNI SKLDKSLKHELESKQI CLDKSETKFPN 
HKE TDD CDAAN I CCHS VG S DNVNS KFYS TTAL KYMVSNPNENS L 
NSNP VCDLDS TS AGQI HL ISDRENQVGR KDTNXQSS I KCV3DVS 
L CNPERTNG TLNSAQED KKSKVPVEGLTI PS KI»SDES AMDEDKH 








ATADSDV&ttKUb'SGQLSEKNSPKNr^SESPBSHBTPBTPFVGH" 

WWLSTGVLHQRBSPESDTGSATTSSDD1KPRSEDYDAGGSQDDD 

GSNDRGISKCGTMLCHDFLGRSSSDrSTPEELKlYDSNIiRIBVK 

M K KQS SNDLFQ VNSTS DDE IPRKRPEIWS RSAI VHS REREN I PR 

GS VQFAQE I DQVSSSADETEDERSEAENVAENFS I SNPAPQQFQ 

GIINtiAFEDATENECREFSANKKFKRSVLLSVDECEEUSSDEGE 

VHTPFQASVDSFSPSDVFDGISHEHHGRTCYSRFSRESEDNILE 

CKQNKGNS VCXNESTVLDLSS IDS SRKNKQSVSATEKKNTIDVL 

SSRS RQLLREDKKVNNGSNVENDI QQRSKFLDSDVKSQERFCHL 

DLHQRBPNSDIPKNSSTKSLDSFRSQVLPQEGPVKESHSTTTEK 

ANIALSAGDIDDCDTLAQTRMYDHRPSKTLSPIYEMDVIEAFEQ 

KVES3THVTDMDF*DDQHFAKQDWTIiLKQLLSEQDSWLDVTNSV 

PEDLSLAQYLINQTLLLARDSSBCPQGITHIDTLNRWSBLTSPLD 

SSASITMASFSSBDCSPQGEWTILELETQH 1 


5819 


1


5S57 


AAAGLLGA'bilLvMTLWAAARAEKEAFVQSESI IEVLRFDDGGtT" 
LQTETTLGLSSYQQKSISZjYRGNCRPIRFEPPMLDFHEQPVOMP 
KME KVYLHN PS S E *TI TLVSI FATTSHFHASFFQNRK I L PGGNT 

sfdvs/vfiarwgnventlfintsnhgvfty\qvfgvgvpnpy 

RLRPFLGARVTVNSSFSPIINIHNPHSEPLQWEMYSSGGDLHL 
ELPTGQQGGTRKLWEIPPYETKGVMRASFSSREADNHTAFIRIK 
TNASDSTEFIILPVEVEVTTAPGIYSSTEMLDFGTLRTQDLPKV 
LNLHLLNSGTKDVPITSVRPTPQ\NDAITVHFKPITLKAS\ESK 
YTKVAS IS PDAS KAK KPSQ FSGK I TVKAKE KS YSKLE IP YQAEV 
L DG YLGFDHAATLFH I RDSP AD PVERP I YLTNTFSFAI LI HDVL 
LPEEAKTMFKVHNFS KPVL I LPNE SG Y I FTLLFMPSTSS MHI DN 
N I LL I TNAS KFHLP VRVYTGFLDY FVL PPKI EERF IDFGVLSAT 
EASNILFAIINSNPIELAIKSWHIIGDG\L3IELVAVDRGNRTT 

iisslpecekssssdqssvtlasgyf\avfrvkltakkl\egih 
dga i q ittdyeilt i p vk\ aviavgsltcs p khwlpp s fpgki 
vhqslnimnsfsqecvkiqqirslsbdvrfyykrlrgnkedlepg 
kkskianiyfdpglqcgdhcyvglpflsksepkvqpgvamqedm 
wdadwdlhqslfkgwtgi keksghrlsaifevntdlqkniiski 

TAB LSW P S ILS S PRHLKFPLTNTNCS S \ EEE ITLENP /SQDVP V 

wqfiplalysnpsvfvdklvsrfnlskvakidlrtlefqvfrn 

SAHPIiQSSTGFMEG\l*SPllLIIiNLILKPGEKKSVKVK\FTPVHN 
RTVSSLI IVRNNLTVMDAVMVQGQGTTENLRVAGKLPGPGSSLR 

fkiteallkdctdslklrepnftlkrtfkventgqlqihietie 
isgyscegygfkwncqeftlsanasrdiiilftpdftasrvir 

ELKFITTSGSEFWILIO^LPYHMLATCAEALPRPNWEIALYII 

isgimsaijxlvigta\yleaqgiwbp\frrrls\feasnppfd 

vgrpfdlrrivgissegnlntlscdpghsrgfggaggsssrpsa 

gshkq*gpsghphsshsnrnsadvddvraynsgrtssmtsaqaa 

S SQ PANKTRPLVLDSNTGAQGHS agrks kgakqsqhgsqhhahs 

pleqhpqpplpppvpqpqepqperlspaplahpshperassarh 

ssedsditslieamdkdfdhhdspalbvfteqppsplpkskgkg 

kplcrkvkppkkqeekektokgkpqedelkdsladddssstttb 

tsnpdtepllkedtekqkgkqampekkesemsqvkqkskkllnx 

kkbiptdvkpsslelpytpplesxqrrklpskiplptamtsgsk . 

srnaqktkgtsklvdnrppalakflpnsqelgntsssegekdsp 

ppetosvpvhkpgsstdslyklslotjlnadiflkqrqtsptpas 

PSPPAAPCPFVARGSYSSIVWSSSSSDPKIKQPNGSKHKLTKAA 
SLPGKNGNPTFAAVTAG YDKS PGGNGFAKVSSNKTGFSSS LGI S 
HAPVDSDGSDSSGLWSPVSNPSSPDFTPLWSFSAFGNSFNLTGE 
VFSKLGLSRSCNQASQRSWNEFNSGPSYLWESPATDPSPSWPAS 
SGSPTHTATSVLGNTSGLWSTTPF3SSIWSSNLSSALPFTTPAN 
TLASIGLMGTEN3PAPHAPSTSSPADDLGQTYNPWRIWSPTIGR 
RSSDPHSNSHFPHEN 


5820 


310 


1270 


RVSLSGPVSLGVIjLCARSSTMGKRDNRVAYMWPIAMMSRGPIQ 

ssgptiq\vi*idc^lpgkk*ksn*krkrk/dskalaefeekmn 

EJWKiG3LEKHREKLLSGSBSSSKKRQRKKKEKKKSW+\DSSSS\ 
SSSSDSSSSSSDSEDEDKK0<3KRRKKKKNRSHKSSESSMSETES 








Ui»iU>:»L»klUUUU» K1X» XKKti 1 KJiiLiS KKKKMYS EDKPLSSESLS 
BSEyiEBVRAKKKKSSEEREKATBKTKKKKKHKKHSKKKKKKAA 
SSSPDS P *H* EKSGFP YKESAMSEE ESTVKTTTYLLKCMNFLVF 
GIIPGLFSSHSDATV 


5821 


179


915 


KWRNQSWRWPKPGTNWMJ^CSVCWJIRVTWTGSVWMRKLGKHPQT 
PT/ 1 KDCS I AATGKRPS ARPPHQRRKKRREMDDGLAEGG PQRSN 
TYVIICLFXIRSVDLAQFSENTPLYPICRAWMRNSPSVRERECSPS 
SPLPPLPEDEEG\SEVTNSKSR*CVQACPPTHT?GGQPKNACR\ 
SRI PSPLAALRMQGTP* RWSPFBPEPS PSTLI YRNMQRWKRIRQ 
RWKEASHRNQLRYSESMKI LREMYERO 


5822 


464 


4379 


QTLKEMPIVMARDLEETASSSEDEEVISQEDHPCIMWTGGdRRT^ 
PVLVFHADA I LTKDNN1 R V IG BR YHLS YKI VRTDS RLVRS I LTA 
HGPHEVKPSSTDYNLMWTGSHLKPFLLRTLSEAQKVNHFPRSYE 
LTRKDRLYKN1 1 RMQHTHGFKAFH I L PQTFLLPAE YABFCNS YS 
KDRGPWIVKPYASSRGRG\VYLINNPNQISLEENILVSRyiKNP 
LLIDDFXFDVRLYVLVTSYDPLVIYLYEEGLARFATVRYDQGAK 
NI RNQFMHLTN YSVNKKSGDY VSCDD P3VEDYGNKWSMS AMLRY 
LKQBG RDTTALMAHVEDL I 1 KTIISAEIiAIATACKTPVPHRSSC 
FELYGFDVLIDSTLKPKLLEVNLSPSLACDAPLDIjKIKASMISD 
MFTWGFVCQDPAQRASTRPIYPTFESSRRNPFQKPQRCRPLSA 
S DAEM KNLVGSAREKGPGKLGGS VLGLSMEE X KVLRR VKE ENDR 
RGGFI R I FPTS ETWE I YGS YLEHKTS MNYMLATRLFQDRMTADG 
APELKI *SLNSKAXLHAALYERKLLSLEVRKRRRRSSRI>RAMRP 
KYPVI TQP AEMNVKTETES EEEEEVALDNEDEEQEAS QEBSAGP 
LRENQAKYTPSLTALVENTPKEWSMKVREWNKKGGHCCKLETQE 
LEPKFNLMQ1I,QDNGNLSKMQARIAFSAYLQHVQI\RLMKDSGG 
QTFS AS WAAKEDEQMEL WRFLKRASNNL QHSLRMVti PSRRLAL 
LERTR IliAHQLGDFI I VYNKETEQMAEKKSKKKVEEEEEDGVNM 
ENFQEF.T RQASEAELEEVLTFYTQKNKSAS VFLGTHS KISKNNN 
NYSDSGAKGDHPETIMEEVKIKPPKQQQTTEIKSDKLSRFTTSA 
EKEAKLVYSNSSSGPTATLQKIPNTHLSSVTTSDIiSPGPCHHSS 
LSQIPS AI PSMPHQPTI LLNTVSASAS PCLHPGAQNI PSPTGLP 
RCRSGSHT IGPFS S FQS AAH I YSQKLS R P SSAKAGS C YLNKHHS 
GIAKTQ KEGEDASLYS KRYNQSMVTAELQRLAEKQAARQYS PSS 
HINLLTQQVTNLNLATGI INRSSASAP PTLRPII SPSGPTWSTQ 
SDPQAPENHSSSPGSRSLQTGGFAWBGEVENNVYSQATGWPQH 
KYHPTAGSYQLQFALQQLEQQKLQSRQLLDQSRARHQAI FGSQT 
LPNSNLWTMNNGAGCRISSATASGQKPTTLPQKWPPPSSCASL 
VPKPP PNHEQ VLRRATSQKAS KGSSAEGQLNGLQSSLNPAAFVP 
ITSSTD PAHTKIMNHKHTEKQ PVHHSW VHD 


5823 


42 


2293 



LLTALSHEGGGGRDEPSACRAGDVNMDDPKKEDILLLADEKFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TSESP PAWS PLAGE KF VE VYKEAHLLALH I ESSSRNQAAQAAKP 
EDPRSQGVERFIQESKP\KINLFBKEKEMKKSPTSLKRETYYLS 
DS PLLGPPVGEPRLLAS S PALPS SGAQARLTRAPGPPHSAHALP 
RESCTAHAASQAATQRKPGTKLLLPRAAS VRGRG1 PGAAEKPKK 
EIPASPSRTKIPABKESHRDVLPDKPAPOaVMVPAAGSHLGQGK 
RAI P VP\NKLGLKKTLLKAPGS YSN\ LQR KSS SGA\ VWSGAS SA 

ctpqpvakakssefasipan*lpglcpnisksvgrmgpamlrpa 

Jj\PAGPVG\ASSWOAIKVDVSELAAEQLTAPP\SASPTQPQTPE 

ggg\qwlnsscawses sqlnktrs 1 rrrdsclnsktkvmptptn 
qfki pkfs igds \pdsstpklsraqrpqsctsvgrvt vhstpvr 
rssgpapqsllsawrvsalptpasrrcsglppmtpktmpravgs 
pl\cvparrrsseprknsamrteptresnrktdsr\lvdvspdr 
gsppsrvpqalnfspeesdstfskstatbvareeakpggdaaps 
eallvdikleplavtpdaasqplidlplidfcdtpeakvavgsb 

SRPLIDU1TNTPDMNKNVAKPSPVVGQLIDLSSPLIQLSPEADK 
ENVDSPLLKF 


5824 


42


2293 


lLtalsi^ggggrdbpsacragdvnmddpkkedillladekfdf" 
dlslssssaneddevffgpfghkbrcxaaslelnnpvpeqpplp 

TSES PFAWS PLAGEKFVBVYKEAHLLALH I ESS SRNQAAQAAKP 
B DPRSQG VfiRFIQES KF \K1 NLFEKE KEM KK£» PTS LKRETY Vl*S 
DSPLLGPPVGBPRLLASSPALPSSGAQARLTRAPGPPHSAHALP 
RES CTAHAAS QAATQRKPGTKLIj LPRAAS VRGRGI PGAAEKPKK 

BIPASPSRTKIPAEKESHRDVLPDKPAPGAVNVPAAGSHLGQGK 
RAI PVP\NKLGIiKKTLUCAPGS YSN\LQRXSSSGA\ VWSGASSA 
CTPQPVAKAKSSEFAS IPAN* LPGLCPNI SKS\GRMGPAMLRPA 
L \PAGFVG \ ASS WQAKR VDVSE LAAEQLTAP P \ SAS P TQPQTP E 
GGG\QM14ISSCAWSESSQLNKTRSIRRRDSCLNSKTKVMPTPTN 
QFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PIi\CVPAJlRRSSEPRKNSAMRTE PTRBSNRKTDSR\ LVDVSPDR 
GSPPSRVPQALNPSPEESDSTFSKSTATEVAREEAKPGGDAAPS 
EALLVDIKLEPLAVTPDAASQPUDLPLIDFCCrrPEAHVAVGSE 
SR PLI DLMTNTPDMNKNVAKPS PWGQL IDLS S PLI QLSPEADK 
ENVDSPLLKF 


5825 


2 


4210 


t'liQ X ES AS PAP FSS GFLAAHPHS PGGSLATKG RSRLSAPGMLHIi 
SAAPPAPPPEVTATARPCLCSVGRRGEGGKMAAAGAtiERSFVEL 
SGAERBRPRHFREFTVCS IGTANAVAGAVKYS ESAGGFYYVESG 
KLFS VTRNRFIHWKTSGDTLBIiMEES LDI NLLNNAIRLKFQNCS 
VLPGGVYVSETQNRVI ILMLTNQTVHRLLLPHPSRMYRSELVVD 
S QMQS I FTDI GXVD PTDPCNYQLI PAVPG 3 S PNSTASTAWLSSD 
GEALFALPCASGGIFVLKLPPYDIPGMVSWELKQSSVMQRLLT 
GWMPTA1RGDQSPSDRPLSLAVHCVEHDAFIFALCQDHKLRMWS 
YKEQMCLMVADMLEYVPVKKDLRLTAGTGHKLRLAYSPTMGIjYL 
GIF\KHAPKRGQFCIFQLVSTESNRYSIiDHISSLFTSQETLlDF 
ALTSTDIWALVJHDAENQTWKYINFEHNVAGQWNPVFMQPLPEE 
EIVIRDDQDPREMYLQSLFTPGQFTNEALCKALQIFCRGTERNL 
DI>S WSEI*KJ<E VTLAVENBLQGS VTEYEFS QEB FRMLQQE FWCKF 
YACCLQYQEAI^HPLALHI^PHTNMVCr*LKKGYLSFI»I PSSLVD 
HLYLLPYBNLIiTEDETTI SDDVDIARDVI CLI KCLRLI EES VTV 
DMSVIMBMSCYNLQSPBKAAEQ1LEDMITIDVENVMEDICSKLQ 
EIRNPIHAIGLLIREMDYETEVEMEKGFNPAQPLMIRMNLTQLY 
GSNTAGYIVCRGVHEaASTRFLICRDLLILQQLLMRLGDAVIWG 
TGQLFQAQQDLLHRTAPLLLSYYLIKWGSECLATDVPLDTLESN 
WHLSVLELTDSGAWlANRi^SSPQTrVELFFQEVARKHIISHL 
FSQPKAPIiSQTGLirrfPEMITAITSYLLQLiWPSNPGCLFLECIjM 
GNCQYVQLQDYIQLLHPWCQVNVGSCRFMLGKCYLVTCEGQKAL 
EC FCQAAS EVG KEEFLDRLI RSEDGE I VST PRLQY YDKVLRLLD 
VTGLPEL VIQ LATSAI TEAS DDW\ KS QATL \ RTCI FKHHI»\ DLG 
\HNSQAYGSL * PQI PDSSRQLDCLRQLVWLCERSQLQDLVEFS 
YVNLHNEWGI IE S RARAVDLMTHNYYELLYAFHI YRHN YRKAG 
TVMFBYGMRLGREVRTLRGLEKDGWCYIiAALNCLRLIRPEYAWI 
VQP VSGAVYDRPGAS PKRNBDGECTAAPTNRQ I E ILELEDLEKE 
CSLAR IRLTLAQHD ?SAVAVAGSSSAEEMVTLLVQAGI,FDTA IS 
u^Ui riLUfLii FVt fcGIAFKCIKLQFGGEAAQAEAWAWLAANQLS 
S VITTKESSATDBAWRLLS TYIi ERYKVQNNLYHHCVINKLLSHG 
VPL P NWL INS YKKVDAABLLRL YLNYDLLDLTP YQ VIR ICGC 


5826 


3 


871 


KSQLLRDHSAPPPKPCTSVGAMGC*PRQ/SPKEQQRQLKKQKNR 

AMQRSRQKHTDKADACiHC^HESLEKDNIJ^RKEIQSLOAELAW 

WSRTLHVHERLCPMDCASCSAPGLLGCWDQAEGLLGPGPQGQHG ' 

CREQLELFQTPGSCYPAQPLSPGPQPHDSPSLLQCPLPSLSLGP 

AWAEPPVQLSPSPLLFASHTGSSLQGSSSiOSALQPSLTAQTA 

PPQPLBLBHPTRGKLGSSPDNPSSALGLARLQSREHKPALSAAT 

WQGLWDPSPHPIjLAFPIjLSSAQVHF 


5827 


194 


2287 


GMGSENSALKSYTLREPPFTLPSGLAVYPAVLQDGKFASVFVYK 
RENEDKVWKAAKVP* *HI^KTLRHPCIiLRFLSCTVEADGlHIjVTE 
RVQ P LEVALE TLS S AEVCAG I YD I LLAL I FLHDRGHLTHNN VCL 
SSVFVSEDGHWKLGGMETVCKVSQATPEFLRSIQSIRDPASIPP 
EEMS PEFTTIj PECHGHARDAFS FGTLVBSLIiTILNEQ vs advls 
SFQQTLHSTLLNPIPKWRPALCTLLSHDFFRNDFLEWNFLKSL 
TLKSEEBKTEFFKFIXDRVSCXSEELIJISRLVPLLLNQLVFAEP 
VAV\KSFLP YLLKJPKKDHAQGETPCLIiS PAI >FQSRV~ PVLLQLF 
EVHEEMVRMVLLSHIBAWGAliSLREQLKKVXlLXPQVLLGSLR 
D\TSDS I VAITLHSLAVLVSLLGPEVWGGERTKI FKRTAP \S F 
TK\KTDLSLBGDPFSQPIKFPINGLSDVKNTSEDSBNFP5SSKK 
SBBWPDWSGPE\EPENQTWl\QrwP\REP\CDDVKSQCTTLDV 
BBS S WDDCEPS SLDTKVNPGGGI TATKP VTS GBQKP I PALTtSLT 
EESMPWKSSLPQKISIiVQRGDDADQIBPPKVSSQERPLKVPSEI* 
GLGEEFTIQVKKKPVKDPEMDWFADMIPEIKPSAAFLILPELRT 
EMVPKKDDVSPVMQFSS KFAAAE ITEGEAEGNEEBGELNWEDNN 
W 


5828


2 


257 


AREGGSLGAVAACG ELS YSCD FCPARPHTS WLTRF VKM B PQAW 
MAVGG GSRMTDLTS S I P KPLL PVGNKFL I W YPLNLLER VG FEEV 
I WTTRDVQKALCAE FKMKMKPD I VC IPDDADMGTADS LR Yl Y P 
KLKTDVLVLS CDliI TDVALHBVVDLFRAYDASLAMIJ^IRKGQDS I 
EPVPGQKGKKKAVEQRDFIGVDSTGKRLLFMANBADLDEELVI K 
GSILQKHPRI RFHTGLVDAHLYCLKKYI VDFLMENG\S ITS IRS 
BL\IPYLV/RGKQFSSASSQQGTRKEKEGGSKGKRGLKSFRISY 
S FY* KEANYTGTGAP Y\D\ACWI 


5829 


266 


1259 


PDGRUVSCSEDICIIKIWDTTNKQCVNNFSDSVGFANFVDFNPS""' 
GTCIASAGSDQTVKVWDVRVKKLLQHYQVHSGGVNCISFHPSGN 
YLI TASSDGTLKI LDLLKGRLI YTLQGHTG PVFTVS FSKGGELF 
ASGGADTQ VL L WRTNFDE LH CKG LTKRNL KRLHFDS P P H LLD I Y 
PRTPHPHEEKVETVEDFFLHLLRL IQSLR* SI CRSLLPLLMISF 
LLI LPOaQKPWGLCQTRVKRPVD IS*TLP*CHQNVCQQPRKRK 
CKT+VTSPVKVK/VSIPLAVTDALEHIMEQLNVLTQTVSILEQR 
LTLTED KLKDCLSNQQ KL FS AVQ Q KS 


5830


4496 


3139 


GGKMAAPEEJUDIiTQEQTEKLLQFQDLTGIESMDQCRHTIjEQHNW""' 

NIEAAVQDRLNEQEGVPSVFNPPPSRPLQVNTADHRIYSYWSR 

PQPRGLLGWGYYLIMLPFRFTYYTILDIFRFALRFIRPDPRSRV 

TDPVGDIVSFMHSFEEKYGRAHPVFYQGTYSQALNDAKRKLRFL 

LVYLHGDDHQDS DEFCRNTLCAP EVI S L INTRMLFWACSTNKPE 

GYRVSQALRENT YPFLAMIMLKDRRE * PV\ VGRLEGLI \QPDDL 

INQLTFIMDANQTYLVSERLBREERNQTQVLRQQQDEAYLASLR 

ADQEKERKKREERERKRRKXBEVQQQXLAEERRRQNLQBEKERK 

LECLPPEPSPDDPESVKIIFKLPNDSRVERRFHFSQSLTVIHDF 

LFSLKESP\EKFQIEA\NFPRR\VLPCIPSEE\WPNPPTLQE\A 

GLSHTEVLFVQDLTDE 


5831 


71 


2897 


FCS KDKCCL YLPDS INRS KSCI'AKPGAHSQDRriAVMDS KQVKD 
TDDIESPKRSIRDSGYIDCWDSERSDSLSPPRHGRDDSFDSLDS 
FGSRSRQTPSPDWLRGSSDGRGSDSESDLPHRKLPDVKKDDMS 
ARRTSHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKKAEREEYR 
KSWSTATSPAGLGKKALQDYGPRT\PVS\DDABSTSMFDMRC3E 
E AAVQPHSRARQEQLQL 2 NNQLRREDD KWCDDLARWKSRKRS V5 
QDLI KKEBERKKME KliLAGEDGTSERRKSIKTYREI VQBKERRE 
RELHEAYKNARSQEEAEGILQQYIERFTISEAVLERLEMPKILE 
RSHSTEPNLSS FLNDPNPMKYLRQQSL PPPKFTATVETTI ARAS 
VI£^SMSAGSGSPSlOVTPXAVPMLTPKPYSQPKNSQDVLKTFK 
VDGKVS VNGET VHREBE KERECPTVAPAHSLTKS QMFEGVARVH 

gsplelkqdngsieinikkpnsvpqelaattbktepnsqedknd 
ggksrkgnielassepqhftttvtrcsptvafvefpsspqlknd 
vssekdqkkpenemsgkvelvlsqkwkpkspepeatltfpfld 
iwpeanqlhlpnlnsqwspssekspvttpfkfwawdpeeerrr 
qekwqqeqerllqeryq\keqdk\lkee\mekaqkeveeeerry 

YEEEP * 1 1 \ EDPWP FTVS S S SADQLSTS SSMTBGSGTMN KI DL 
GNOQDEKQDRR WKKS FQGDDSDLLLKTRE SDRLBEKGSLTEGAL 
AHSGNPVSKGVHEDHQtiDTEAGAPHCGTNPQLAQDPSQNQQTSN 
PTHSSEDVKPKTL PLDKS INHQI ES PS ERRXS I S GKKLCSS CGL 
PLGKGAAMI IETLNL YFHI Q CFRCG \ I CKGQLGDAVSGTDVRI R 
NGLLNCNDCYMRSRSAGQPTTIi 


5832 


2454 


829 


PGRRFRHGSCAFQKQCI MLH I CQ YFLQGECKFGTSCKRSHDFSN ' 
SENLEKLEKIiGMSSDLVSRLPTIYRIIAHDIKNKSSAPSRVPPLF 
VPQGTSERKDS SGS VS PNTLS QEEGDQ ICI/YHI RKS CS FQDKC H J 
RVHFHLP Y K WQ FLDRGKWEDLDNMELI EE AYCNPKI BRILCSES 
ASTFHSHCI^FMAMTYGATQARRLSTASSVTKPPHFILTTDWIW 
YWSDEFGSWQEYGRQGTVHPVTTVSSSDVEKAYLAY/WYTGV*R 
PGSHLEVPGRKAQLRVRFQSliRSEKPGLWHN*KGLPQTQIR\AP 
QDVTTMQTCNTKFPGPKS I PDYWDSSAIiPDPGFQRITIiSSSSEE 
YQKVWNLFNRTLPFYFVQKIERVQNLALWEVYQWQKGQMQKQNG 
G KA VDERQLFHGTSA I FVDAI CQQNFDWRVCGVHGTS YG KGS YF 
ARDAA YSHH Y S KS DTQT HTM FLARVLVGB FVRGNAS FVR P PAKE 

GWSNAFYDSCVNSVSDPSIFVIFEKHQVYPEYVIQYTTSSXPSV 
TPSILLALGSLFSSRQ 


5833 


170 


3289 


SI LCLLS PCVVQFGKPWS ILSSRSRHSPCTKKGWEGMRKHIjHT J 
RQGHK* VHVE I S KALWVYRDDY F IRHS IS VS AVI VRAWI THKYR 
GRDWNVKWEENtiLHAVAKNYTLLQTI PPFERPFKDHQVCLEWNM 
GYIWNIJIANRIPQCPLENDVVALLGFPYASSGENTGIVKKFPRF 
RNRELEATRRQRMDYPVFWSLWLYI»tjHYCKANLCGILYFVDSN 
EM YGTP S VFLTE EG YLH IQMHLVKGEDLAVKTKF I IPLKEWFRL 
DISFNGGQIVVTTSIGQDLKSYHNQTISFREDFHYNDTAGYFII 
GGSRYVAGIEGFFGPLKYYRLRSLHPAQIFNPLLEKQLAEQIKL 
YYERCAE VQE I VS VYASAAKHGGERQ EACKLHNS YLDLQRRYGR 
PSMCRAFPWEKELKDKHPS LFQALL EMDLLTVPRNQNBS VS BIG 
GKIFB KAVKRLSS IDGLHQISS IVP FLTDSSCCGYHKASYYIiAV 
FYETGLNVPRDQLQGML YS LVGGQGS ERLS SMNtiGYTOFTSf QG I DN 
YPLDWELS YAY YSNTATKT PLDQHTLQGDQAYVETI R T iKDD E IL 
KVQT KS DG DVFMW LKHEAT RGNAAAQQRLAQML FWGQQ GVAKNP 
RAAI E WYAKGALETEDPAL I YDYAI VLFKGQGVKKNRRLALELM 
KXAAS KGLKQAVNGLG WYYHKFKKNYA\ KAAKYNLKA\ EE \ KGN 
PDASYNLGVLHLDGIFPGVPGRNQTLAGEYFHKAAQGGHMEGTL 
WCSLYYITGNLETFPRDPEKAVVWAKHVAEKNGYLGHVIRKGLN 
AYLEGSWHEALLYYVLAABTGIEVSQTNIAHICEERPDLARRYIi 
GVNCVWRYYNFS VFQI DAPS FAYLKMGDLYYYGHQNQSQDLEIjS 
VQHYAQAALDGDSQGFFWLALLIEEGT^IPHHILDFLEIDSTIjH 
SNNISILQELYERCWSHSNEESFSPCSLAWLYLHLRLLWGAILH 
SALIYFLGTFLLS ILIAWTVQYFQSVSASDPPPRPSQASPDTAT 
STASPAVTPAADASDQDQPTVTNNPEPRG 


5834


17 


4020 


RFRRGGGRVFPGAFPASPSDSLGQGNSQGPPRTPKPPRT/QECG" 
SAAPGPIPGQSSS^VPLRLEQIQQKADCPLSLELALKPRMAAQV 
TLBDALSNVDLLEELPLPDQQPCI EP PPSS LLYQ PNFNTNFEDR 
NAFVTG I ARY I EQATVHS SMN EMLEEG QEYAVMLYTWR SCSRAI 
PQVKCNEQPNRVEI YEKTVEVLEPEVTKIWNFMYFQRNAIERFC 
GEVRRLCHAERRKDFVSEAYLITI^KFIl^FAVIJDELKNMKCSV 
KNDHSAYKRAAQFLRKMADPQSIQBSQNLSMFLANHNKITQSLQ 
(X3LBVISGYEEIlLADIVNLCVDYYENR^4YLTFSEKHMLI^KV^IGF 
GLYLMDGS VSNIYKLDAKKR INLSKI DKYPKQLQWPLFGDMQ I 
ELARYIKTSAHYEBNKSRWTCTSSGSSPQYNICEQMIQIRBDHM 
RFISELARYSKS E WTGSGRQE AQKTDAEYRKL FDLALQGIiG;I,I, 
SQWSAHVMEVYSWKtiVHPTD3CYSNKDCPDSAEEYERATRYNYTS 
EEKFALVBVIAMIKGLQVLMGRMESVFNHAIRHTVYAALQDFSQ 
VTXMEPLRQAIKKKKNVIQSVI»QAIRKTVCDWETGHEPFNDPAI* 
RGEKDPKSG*D2 KVPRRA.VGPSSTQLYI4VRTMLESL IADKSGSK 
KTLRS SLEGPT I LD IEKFKRES FFYTHL INFSETLQQCCDLSQI. 
WFREFFLSLTMGRRIQFPIEMSMPWILTDHILETKEASMMEYVI* ' 
YSLDLYNDSAHYALTRFNKQFLYDEIEAEVNLCFDQFVYKZiADQ 
IFAYYKVMAGSLLLDKRLRSECKNQGATIHLPPSNRYETLI.KQR 
HVQLLGR S IDIiNRLI TQRVSAAM YKS LELAIGRFESEDLTS I VE 
LDGLLE INRMTHKLLSR YLTLDGFDAMFREANHWSAP YGRI TL 1 ' 
HVFWELNYDFLPNYCYNGSTNRFVRTVLPFSQEFQRDKQPNAQP 1 
QYLHGSIcALNLAYSSIYGSYRNFVQPPHFQVICRLLGYQGIAVV 
MBELLKVVKSLLG^TILQYVKTLMEVMPiaCRLPRHEYGSPGIL 
EPFHKQLKDI VE YAELKTVCFQNLREVGNAILFCLLI EQSLS LE 
E VCDLLHAAPFQN I LPRVHVKEGERLDAKM KRLESKYAPLHLVP 
L IERLGTPQQ 1 A IAREGDLLTKERLCCGLSM FEVI I/TR IRS F£»D 
DPIWRGPLPSNGVMHVDECVEFHRLWSAMQFVYCIPVSTHEFTV 
EQCFGDGLHWAGCMIIVLLGQQRRFAVLDFCYHLLKVQKKDGKD 
EIIKNVPLKKMVERIRKFQILNDEIITILDKVLKSGDGEGTPVB 
HVRCFQPPIHQSIASS 


5835


4209 


1904 


SGNI RMAQGSHQID FQVLHDLRQKFPEVPETWSRCMLQNNNNli 
DACCAVLSQESTRYLYGEGDliNFSDDSGISGLRNHMTSLKLDLQ 
SQNIYHHGREGSRMNGSRTLTHSISDGQLQGGQSNSELFQQEPQ 
TAPAQVPQGFNVFGMSSSSGASNSAPHLGFHLGSKGTSSLSQQT 
PRFNPIMVTLAPNI QTGRNTPTSLHIHGVPPPVLNSPQGNS I YI 
RP Y ITTPGGTTRQrQQHSGW VS Q FNPMNPQQ VYQPSQ PG P WTTC 
PASNPLSHTSSQQPNQQGHQTSHVYMP I SSPTTSQP PTIHS SGS 
SQSSAKSQYNIQNISTGPRKNQIEIKIiEPPQRNNSSKLRSSGPR 
TSSTSSSVNSQTLNRNQPTVYIAASPPNTDELMSRSQPKVYISA 
WAATGDEQVMRNQPTLFISTNSGASAASRNMSGQVSMGPAFIHH 
HPPKSRAIGNNSArSPRWVTQPNT\EYTFKITVSPN£CPPAVSP 
GWSPTFELTNLLNHPDHYVETENIHHLTDPTLAHVDRISETRK 
I>SMGSDDAAYTQDI*RISNSWLGMVAHACN"SSALGGQDGRII*A 
QEFETS WGNI WRIiRLYRRF*NYAGMVAHTCSPS YSVD*ALIiVHQ 
KARKERLQRELEIQKKKLDKLXSEVNEtlENNLTRRRLXRSNSIS 
Q I PS liEEMQQLRSCNRQLQI D I DCLTKE I DLFOARG PHFNPS AI 
HNFYDN I GFVG P VPP KPKDQRS 1 1 KT PKTQDTEDDEGAQWNC7A 
CTFLNHPALI RCEQCEMPRHP 


5836 


361 


2303 


FHITMCGICCSVNFSAEHFSQDLKBDLLYNLkQRGPNSSKQLLK" 
S DVNYQ CLFSAHVLHIiRG VLTTQPVEDERGNVFLWNG E IFS G I K 
VEAEEUDTQILFNYIiSSCKNESEILSLFSSVQGPWSFIYYQASS 
HYLWFGRDFFGRRSLLWHFSNLGKSFCLSSVGTQTSGLANQWQE 
VPAS\DFSBLILSLI*SFPDALFYNCIIiGNIFIjGRILLKKMIiIA* 
VXFQQTYQHLYOR* QMKPNCILECNLLFL* I *CCHKLHWRLI AVI 

FPMCHLQERYFKS pllmyt*kevtqqfi dvls vavkkrvlclpr 
DENLTANEVLKTCDRKANVAILFSGGIDSMVIATLADRHIPLDE 
PI0LLNVAFIAEEKTMPTTFNR3GNKQKNKCEIPSEEFSKDVAA 
AAADSPNKHVSVPDRITGRAGLKELQAVSPSRIWNFVEINVSME 
ELQKLRRTRICHLIRPLDTVLDDSIGCAVWFASRGIGWLVAQEG 
VKSYQSNAKVVLTGIGADEQLAGYSRHRVRFQSHGLEGLNKEIM 
MELGRISSRNLGRDDRVIGDHGKEARFPFLDENWSFLMSLPIW 
EKANLTLPRGIGEKLLLRIAAV3LGLTASALLPKRAMQFGSRIA 
KMEKINEKASDKCGRLQIMSLENLSIBKETKL 


5837


4792 


903 


NGNAVAQAP VTIJCCYIjATGSKDQTIR I WSCS RGRGVMI LKLPFL 
KRRGGG I DPTVKERI*WLTI*HWPSNQPTQI*VS S CFGGELLQWDLT 
QSWRRKYTLFSASSEGQNHSRIVFNLCPLQTEDDKQLLLSTSMD 
RDVKCKDIATLECSWTLPSLGGFAYSLAFSSVDIGSIxAIGVGDG 
M IRVWNTLS I KNNYDVXNFWQGVKSKVTALCWHPTKEGCLAFGT 
DDGKVGLYDTYSNKPPQISSTYHKKTVYTLAWGPPVPPMSLGGE 
GDRPSLAL YS CGGEG I VLQHN PWKLSGEAFDI NKLIRDTNS I KY 
KL P VHTEI S WKADGK IMALGNEDGSIE I FQ\ I PKLKL I CTIQQH 
HKL VNTI S WHHE\ HGSPAQXLS YIi \MPSGSQQCS PFTCHNLKNC 
P* KAAPBS PSDPLQSPYRTPPQGHTAQDYPVWAWEPHIH* WEGL 
VFCFPIDGYSPGCWD\AFPGKEAPVAIFRG\HQGRLLCVAWSPL 
DPDCIYSG\ADDFCVHKWLTSMQDHSRPPQGKKSIBLEKKRLSQ 
PKAKPKKKKKPTLRTPVKLESIDGNEEESMKBMSGPVENGVSDQ 

ILLKKEPP KEKPETLIKKRKARSLLPLSTSLDHRSKEEI»HQDCL 
VLATAKHSRELNE D VSADVEERFHLGLFTDRATLYRM IDI EGKG 
HLENG HP E LFHQLMLW KGDLKGVLQ TAAERG EI*TDNL VAMAP AA 
GYHVWLWAVEAFAKQLCFQDQ YVKAASHLLS IHKVYE AVELLKS 
NHFYRBAIAIAKARLRPSDPVLK0LYLSWGTVLBRDGHYAVAAK 
CY1X3ATCAYDAAKVLAKKGDAASLRTAAELAAIVGEDELSASLA 
LRCAQELLLANNWVGAQ2ALQLHE5LQ>3QRLVFCX1»ELLSRHLE 
EKQLSEGKSSSSYHTWNTGTEGPFVERVTAVWKSXFSLOTPEQY 
QEAFQKLQNIKYPSATNNTPAKQIj^LHICHDLTLAVLSQQMASW 
DEAVQALLRAWRSYDSGS FT IMQBVYSAFLPDGCDHLRDKtGD " 

HQS PATPAF KSLEAFFLYGRLYEFWWSLSRPCPNSSVWVRAGHR 

TLSVEPSQQIiDTASTEETDPSTSQPEPNRPSEIiDIiRLTEEGERM 

LSTFKELFSEKHASLQNSQRTVAEVQETLAEMIRQHQKSQLCKS 

TANGPDKNBPEVEABQPLCSSQSQCKEEKNEPLSLPELTKRI>TE 

ANQRMAKFPESIKAWPFPDVLECCliVIJjLIRSHFPGCLAQEMQQ 

QAQEIjLQKYGNTKTYRRHCQTFCM 


5838 


110 


98 


KTMPHLLVT FRDVAID FS Q BE WE CLDPAQRDLYRD VMLEN YSNL 
IS LDLESS CVTKKLS P EKE I YEMES \ PSGR I WGNVST I T FQYNG 
DGDNMECKGNLEGQVSKSEGLYMCVKITCEEKATESHSTSSTFH 
RII /HYQGKI VKCKECRQG PSYLSCLIQHEENKNI* KCSEVNKH 
RNTFSKKPS Y I * HQ \ K FRLGEKP YE CMSCGXAFGRTSDL I QHQK 
IHTNEKPYQCNACGKAFI RGSQLTEHQRVHTGEKPYDCKKCGKA 
FSYCSQYTLHQRIHSGEKPYECKDCGKAFILGSQLTYHQRIHSG 
EKPYECKECGKAFILGSHLTYHQRVHTGEKPYICKECGKAFLCA 
SQLNEHQRIHTGE KP YECKEGGKTF FRGSQLTYHLRVHSGERPY 
KCKECGKAF ISNSNL I QUQRIHTGEKP YKCKECGKAF ICGKQLS 
EHQRIHTGEKPFECKECGKAFIRVAYLTQHEKIHGEKHYECKEC 
GKTFVRATQLTYHQR IHTGEKP YKCKECDKAF/HLWLTI LSEHQ 
RIKRGEKPYECKQCGR/LFIRGSHL/NEHLRTHTGEKPYECKEC 
GRAFSRGSEHTIjHQR I HTG E KP YTCVQCGXDFR CPSQLTQHTRI* 
HN*EYSSHKICMHSIAIASLDFAHLQEKNPEN 


5839 


1


2425 


GRPFPRPPRAI,PRLPLRGRRQDGRWTVDFEECLKD\SPRFRAAL 
EEVEGD VAELELKL\DKLVKLCI A\MI DTGKAFCVANKQFMNG I 
RD\IAQNS \NNDA\ WETKFAPS FLDSLQEMINFHTIL/L* PNS 
E IN * GHS FQNF VKEDLRK F KDAKKQFBNSQ* KRKKI ALVKNAPV 
PSRPAS LBL * KP PNILTAT RKCFRHI ALDYVLQI NVIiQS KRRSE 
ILKSMLSFMYAHIiAFFHQGYDLFSELGPYWKDLGAQLDRLVGDA 
AKEKREMEGKHSTIQQKDFSRDDSKLKYNVDAANGIVMEGYLFK 
RASNAFKTWNRRWFSIQNNQVVYQKKFKDNPTVVVEDLRLCTVK 
HCEDIERRFCFBVVSPTKSCMLQADSEKLRQAMIKAVQTSI\AT 
AYRBKDDESEKLDKKSSPSTGSLDSGNESKBKLLKGESAIiQRVQ 
CIPGNASCCDCGLADPRWASINLGITLCIECSGIHRSLGVHFSK 
VRSLTl^WEPELLKLMCELGNDVINRVYEANVEKMGIKKPQPG 
QRQEKEAYIRAKYVERKFVDKIFL*SLSPP\BQQKK\FVSKSSB 
EKRLS I SKFX3P\GDQVRASAQSSVRSNDSGIQQS SDDGRBSLPS 
TVSANS LYEPBGERQDSSMFLDSKHIiNPGLQLYRASYEKNLPKM 
AEALAHG ADVNVIANSBENKATPLI QAVLGGSLVTCEFLLQNGAN 
VMQRD VQGRG PLHHATVI^HTGQVCL FZJCRGANQHATDEEGKDP 
LSIAVEAANADIVTLLRLARMNEEMRESEGLYGQPGDBTYQDIP 
RDFSQMASNNPEKLNRFQQDSQKF 


5840 


698 


3610 


KHLHLPRQHLTTLWQI SSPRWRSPQRAFMSAiiS KTQTQSAPALQ 
GLSS IiLQS VTGNPVPASBAASQSTSASPANTTVYTI KGRNLPSS 
AQPFI PKSFN YS PNSSTSEVSSTSASKAS IGQSPGLPSTAFKLP 
SNTKGFTATHNTSPAAPPTEVTICQSSEVSKPKL\ESESTSPSL 
\3HKIHNFUCGNPGFSVA*NLKHPNPAGSLGSSAPSESHPSDFQ 
RGPTSTSIDNIDGTPVRDERSGTPTQDEMMDKPTSSSVDTMSliL 
SKIISPGSSTPSSTRSPPPGRDESYPRELSNSVSTYRPFGLGSE 
SPYKQ PS DGMERP S S LKDS S QEKFYPDTS FQEDEDYRDFE YS GP 
PPS AMMNLQKKPAKS 1LKS S KLSDTTE YQ P I LS S YSHRAQB FGV 
KSAFPPSVRALliDSSENCDRLSSSPGLFGAFSVRGNBPGSDRSP 
SPSKNDSFFTPDSNHNSLSQSTTGHLSLPQKQYPDSPHPVPHRS 
IiFS PQNTLAAPTGH PPTSG VE KVLAS T ISTTSTI E F KNMLKNAS 
RKPSDDKHFGQAPSKGTPSDGVSLSNLTQPSLTATDQQQQEEHY 
RI ETKVS5S GLD LPDSTBE KG AP IETLG YHSASNRRMSGEP IQT 
VESIRVPGKGNRGHGREASRVGWFDLSTSGSSFDNGPSSASELA 
SIiGGGGSGGIjTGFKTAPYKERAPQFQESVGSFRStrS FNSTFEHH 
LPPSPLEHGTPFQREPVGPSSAPPVPPKDHGGIFSRDAPTHLPS 
VDLSNPFTKEAALAHAAPPPPPGEHSGIPFPTPPPPPPPGEHSS 
SGGSGVPFSTPPPPPPPVDHSGWPPPAPPLAEHGVAGAVAVFP 
KDHSS LLQGTLAEH FGVLPGPRDHGG PTQRDLNGPGLSRVRESI* 


5841






TIaP SHS LEtiliGPPHGGGQGGGSNS S SGP PLGP5HRDT IS RSd 1 1 

LRSPRPDFRPREPFLSRDPPHSLKRPRPPFARGPPFPAPKRPFF 
PPRY" 




1908 


762 


glr^flvltvwpmmkpswlsrtepskrllcrtlWcqsgwssrsy 

TRSMLKMTTS INRRSRTSTKSTRTS AR PGLTATVS IGLSDS PTW 
RHCNMTARSCSGEKGGHWAPRQVGVYLLPGRVGCVSSRVSPSFP 
GDGLDSGLARRGSAVSALASGLVEEPMLGPPFHPTPRFKAVSAK 
SKEDLVSQGFTEFT'IEDFHNTFMDI.IBQVEKQTSVADLLASFND 
QSTSDYLWYLRLLTSGYLQRESKFFEHFIBGGRTVKEFCQXQE 
\ VE PMCKESDH IH I 1 ALAQGLQRVHPGWE YMG PRPRAATTNPHT 
FP*GLPSPKVYI,LYRPG\HYDILYK1GLGSSPLGCPGCPLLARA 
LGHCYRGFSWVKWS YFTPFFLSHDP PPMFY 


5842 
5843


307 


1918 


qeptadfio.rstcgcgremtcpdkpgqlinWficSLc^prvr^- 

WSSRRPRTRRNLLLGTACAIYLGFLVSQVGRASLQHC-OAAEKGP 
HRSRDTAEPSFPEIPLDGTLAPPESQGNGSTLQPNWVITLRSK 
RSKPANIRGT VXPKRR KKHAVAS AAPGQEALVG P SLQ PQEA\EG 
KLML^HLGTLREQTWLRLESDPGGWCGVRB/WRAGGPDFLQPSS 
RESNIRIYSESAPSWLSKDDrRRMRLIADSAVAGLRPVSSRSGA 
RLLVLEGGAPGAVLRCGPSPCGLIiKQPLDMSEVFAFHLDRILGL 
NRTLPSVSRKAEFIQDGRPCPIILWDASLSSASNDTHSSVKLTW 
GTYQQLIiKQKCWQNGRVPKPESGCTEI HHHEWS KMALFDFLLQI 
YWRLDTNCCGFRPRKEDACVQNGLRPKCDDQGSAALAHI IQRKH 
DPRHLVFIDNKGFFDRSEDNLNFKLLEGIKEFPASAVYVLKSQH 

LRQKLLQSLFLDKGYWESQGGRQGIEKLIDVIEHRAKILITYIN 
AHGVXVLPMNE 


5844 


500 


1453 


GXARliVTCWVLHGO*VKKPAWEPGVVWL*Q*RCRPKGWGLGAGM 
R3SRMS QPPQCLRRAQS S CCHFMVKLLDDGTFMI PGEKVAHTSL 
DALVTFHQQKP I E PRRBLLTQ PCRQ KDPANVDYED LFI, YSNAVA 
EEAACPVSAPEEASPKPVLCHQSKERKPSAEM/RQNNHQGSHFE. 
LPPKI PS WRD p P ETLEE PQNAPRERP EGPAAAKKP PRHCB LWT 

LGCPEIHGDLRPWDRKRQPRSLRGSHLGGQRLHGSLCGH1SQKP 
LTAPGT JCRQKG PHQEGREVGQLH*GD PRGQELA PNGS ES P I LPG 
VQARAPGLGRA 




202 


2471 


FDSAVLSSINVMAVLPGPLQLLGVLLTISLSSIRLIQAGAYYGI 

KPLPPQIPPQMPPQlpQYQpi^GQQVPHMPLAKDGliAMGKEMPHL 

OYGKEYPHLPQYMKEIQPAPRMGKEAVPKKGKE I PLASLRGEQG 

PRGEPGPRGPPGPPGLPGHGIPGIKGKPGPQGYPGVGKPGMPGM 

PGKPGAMGMPGAKGEIGQKGEIGPMGIP*PQGPPGPHGLPGIGK 

PGGPGLPGQPGPKGDRGPKGLPGPQGLRGPKGDKGFGMPOAPGV 

KGPPGMHGPPGPVGLPGVGKPGVTGFPGP\QGPLGK\PGAPGEP 

GPQGP1GVPGVQGPPGIPGIGKPGQDG\IPGQPGFPGGKGEQGL 

PGLPGP PGLPGI GKPGFPGPKGDRGMGGVPGALG PRGEKGP IGA 

PGIGGP PGE PGLPG I PGPMGPPGAIGFPGPKGEGG I VGPQG PPG 

PKGEPGLQGFPGKPGFLGEVGPPGMRGFPGPIGPKGEHGQKGVP 

GLPGVPGLLGPKGEPG I PGDQGLQGPPG I PG IGG PSGP IGP PG I 

PGPKGEPGLPGPPGFPGIGKPGVAGLHGPPGKPGALGPOGQPGL 

PGPPGPPGPPGPPAVMPPTPPPOGEYLPDMGLGIDGVKPPHAYG 

AKKGKKGGPAYEMPAFTAELTAPFPPVGAPVKFNKLLYNGRONY 

NPQTGIFTCEVPGVYYFAYHVHCKGGNVWVALFKNNEPVMYTYD 

B YKKG FLDQASGS A VLLLR PGDR VFLQM P S EQAAGL YAG Q YVFS 

SFSGYLLYPM 


5845


215 


2061 



HASNKSASLQDKMANPK£KTAMCLVNELARFN^VnpnYyT djcp ~ 
G PAHSKMFS VQLS LGEQTWESEG SS I KKAQQAVGNKALTESTLP 
KPI*KPPKSNVNNNPGCITPTVELNGLAMKRG\KPAIHRPLDPK 
PFPNNRANYNFQVKYNQRVHCP I PKI FYVQLTVGNNEFFGBGKT 
RQAARKNAAMKALCALQNEPI p ERSPQNGES GKDiMDDDKDANKS 
BISLVFEIALKRNMPVSFEVIKESGPPHMKS FVTRVSVGEFSAE 
3EGNS KKLSKKRAATT VLQELKKLP PLPVVEKPK\HFFKKRPKT 
IVKAGPEYGQGMNPISRLAQIQQAKKEKEPDYVLLSERGMPRRR 
2 FVMQVKVGNEVATGTGPNKKIAKKNAAEAMLLQLGYKASTN^ 
X}LEKTGENKGWSGPKPGFPEPTNNTPKGILHLSPDVYQEMEAS 


5846 






RHKVISGTti^SYLSPKDMNQPSSSFFSISPTSNSSATIAREUiM 

n ROSGKECVTCLTIiAPVQMTFHAIGSSlEASHDQV*YATAILLC 

YGPARKWKAIKMEAMCAHAALI^LIHYLIAPSARLEKSKLFALG 

N- 




1126 


456 


FSKLIKKTFl IGISGVTNSGKTTIiAKNI#QKHIjPNCSVI SQDDFF 

KPE S E XETD KNGFLQ YD VLEALNMEKMMS AIS CWKES ARHS VVS 

TDQ3S AEE I P 11*1 I EGFLLFNYKPLDTI WNRS YFLT I P Y3ECKR 

RRSTRVYQPPDSPGYFDGHVWPMYLKYRQEMQDITWEWYIjDGT 

KSEEDLFLQVYEDLIQELAKQKCLQVTA*RRNTTWPS/CK*IRK 
LQGVI 


5847
5848


2769 


SOS 


apemedlsspdstllqgghnllssasfqesvtfkdvivdftqee 

WKQLDPGQRDLFRDVTLENYTHLVSIGIjQVSKPDVISQLEQGTE 
P WIMEPS I PVGTCADWETRLENSVSAPEPD1 SEEELSPE VI VEK 

kkrddswssnlleswbyegslerqqanqqtlpkeikvtektips 

WEKGPVlWEI^KSVr^SSNLVTQEPSPEETSTKRSIKQNSNPVK 
KEKSCKCNECGKAFSYCSALIRHQRTHTGBKPYKCN*/CVEKAF 

srsendinhqrihtgdkpykcdqcgkgfiegpsltqhqrihtge 

K P YKCDEOGKAFS QRTHLVQHQRI HTGEKP YTCNEOGKAFSQRG 
H FMEHQK IHTGEKPF KCD B CDKTFTRS THLTQHQ KIHTGEKTYK 
CNECG KAFNG PSTFIRHHM IHTGEKP YECNECG KAFSQHSNLTQ 
HQKTHTGEKPYDCAECGKSFSYWSSLAQHLKIHTGEKPYKCNEC 
G KAFS YCSSLTQHRR IHTREKPFECS E CGKAFS YLSNLNQHQ KT 
HTQE KAYECKECGKAFIRSSSLAKHER I HTGEKP YQCHECGKTF 
•SYGSSLIQHRKIHTGERPYKCNEOGRAFNQNIHLTQHKRIHTGA 
KP YE CASCG KAFRHCSS LAQHQKTHTE EKPYQCNKCEKTFS QSS 
HLTQHQRI HTGE KP YKCNECDKAFS RS THLTQHQRIHTGB KP YK 

QJBCGK\TFSQSTYLIQHQRIHSGEKPFGCNDCGKSFRYRSALN 
KHQRLHPGI


5849


22 


2961 


AAPRRIiLRGGDGDRTPRFPLPALiLRPGPPAEAAPERRKWPAVSK 
GDGMRGUVVFI SD IRNCKSKEAElKRINKEIiANI RSKFKGDKAL 

DGY5KKKYVCK1^FIFLI^HDIDFX^EAVNLLSSWRYTEKQIG 
YLFISViVNSNSELIRLINNAIKNDLASRNPTFMGLALHCIASV 
GSREMAEAFAGEIPKVLVAGDTMDSVKQSAALCLLRLYRTSPDL 
VPMGD WTSRVVHLLNDQHLGVVTAATS h I TTLAQ KNP EEFKTS V 
SLAVSRLSXRIVTSASIDLQDYTY^FCPGFLGIiSVKLLRLLQCY 
P PPDPAVRGRLTECLETIIiNKAQEPPKS KKVQHSNAKNAVL FEA 
ISLIIHHDSEPNLLVRACNQLGQFLQHRETNLRYLALESMCTLA 
3 3EFS HEAVKTHI BT VINALKTER DVS VRQRAVDLLYAMCDRSN 
APQ IVAEMLS YLETADYS IREB I VLKVA I LAEKYAVDYTW\ YVD 
TILNLIRIAGDYVSEEVWYRVIQIVIKRDDVQGYAAKTVFEALQ 
APACMENLVKVGGY ILGEFGNLIAGDPRS SPLIQFHLLH3 KFHL 
c VBiUrViL VKPTIQDVLRSDSQLRNADVEL 
QQ RAVE YLRLSTVASTD I LATVLEKMPPFPERESS I LAKLKKKK 
GPS TVTDLEDTKRDRS VDVNGGPEPAPAS TSAVSTP S PSADLLG 
LGAAPPAPAGPPPSSGGSGLLVDVFSDSASWAPLAPGSEDNFA 
RFVCKNNGVLFENQLLQIGLKSEFRQNLGRMFI FYGWKTSTQPL 
NFTPTL I CSDD LQPNLN T QTKPVD PTVEGGAQVQQVVNT E CVS D 
FTEAPVLN I QFRYGGTFQNVS VQLP ITLNKFFQPTEMAS QDFFQ 
RWKQLSNPQQE VQNI FKAKHPMDTEVTKAK I IG FGS ALLESVD P 
NPANTFVGAGI t HTKTTQIGCIXRLEPNLQAQMYRLTLRTS KEAV 
SQRLCELLSAQF 




3545 


1895 



KRRE I KET VFHHVAQ AGLELLS S SUP PSSAS RSAGI TGMRHQVQ 
P*DPCMSLS P PCFTEEDRFSLEALQTIHKQMDDDKDGG IEVEES 
DEFIREDMKYKDATNKHSIILHREDKHITIEDLWK^KTSEVHNW 
rLEDTLQWLIEFVELPQYEKNFRDNNVKGTTLPRlAVHEPSFMI 
SQLKISDRSHRQKLQLKAIjDWLFOPLTRP PHNWMKDFILTVSI 
/I G VGGCW F AYTQNKTS KEHVAKMMKDLES LQTAEQSIiMD LQER 
jBKAQE GNRNVAVEKQNL* RKMMDB INYAKEEACRLRE lregae 
2B LSRRQYAE^ELEQVRMAL KKAEKEFELRSS WS VPDALQKWIjQ 
.THEVEVQYYWIKRQNAEMQLAIAKDEAEKIKKXRSTVFGTI^ 
AHS SSLDE VDHKILEAKKALS ELTTCLRERIi?RWQQ I EKl CG FQ 
I AHNSGL PS LTS SLYS DH S W V VMPR VS I PP Y? I AGGVDDLDBDT 
PPrVSQPPGTMAKPPGSLARSSSLCRSRRSIVPSSPQPQRAQLA 
PHAPHPSHPRHPHHPQHTPHSLPSPDPDILSVSSCPAIiYRNEEE 
EBAIYFSAEKQWEVPDTASECDSLNSSIGRKQSPP/SKPRDIPN 
IIS/DERYQEMRCP+RIPSGGIL 


5850


3 


1835 


KAVLNF SASGS VISLTGSNPMHDASMWHLKKNGI I VYLDVPLLN 
LI CRLKLMKTDRI VGQNSGTSMKDLL KFRRQYYKKW YDARVFCE 
SGAS PBB VADKVLNAI KRYQDVDSETFISTRHVWPBDCEQKVSA 
BFFIEAVIEGIiASDGGLFVPAXEFPXLSCGEWKSLVGATYVERA 
QILLERCIHPADIPAARLGBMIETAYGENFACSKIAPVRHLSGN 
QFII»ELFHGPTGSFKDLSLQLMPHIFAQCIPPSCNYMILVATSG 
1 DTGSAVLNGFSRLNKNDKQRIAWAFFPENGVSDFQKAQIIGSO 
RENG WAVGVESDFDFCQTAI KR I FNDS D FTG FLT VEYG T I LSSA 
KS INWGRLLPQWYHASAYLDLVSQGFIS FGSP VDVCI PTGNFG 
K I LAAVYAKMMG IP 1 R KFI CASNQNHVWTDFI KTG \HYDLRGKE 
N* AQTFFTVQ* I FLPNI^NLERHLHLMANKDGQI,MTBLFNRIiES 
QHHFQIEKALVEKLQQDFVADWCSEGECLAAINSTYNTSGYILD 
PHTAVAXWADR VQDKTCP VI I SSTAH YS KFAPAIMQALKIKE I 
NETS SSQLYLLGS YlUu^PLHEALLERTKQQEKMEYQVCAADMN 
VLKSHVEQLVQNQFI 


5851 


3120 


1802 


RCYLQFLALLLTSTSARAAAAIAAAEEPAGSPSVMTRAGDHMRQ " 
RGCCGS LAD YLTS AKFLLYLGHSLSTWGDRMWH FAVSVF1>VBL Y 
GN S LLLTAVYGLWAGS VLVLGAI IGD WVDKNARLKVAQTS LW 
QNVS VILCG I ILMMVFLHKHELLTMYHGWVLTSCYILI ITIANI 
ANIiASTATAITI QRDW1 VVVAGEDRS KLANMNATI RR I DQLTN I 
LAPMAVGQI MTFGSPV3 GCGFISGWNLVSMCVEYVLLWKVYQKT 
PALAVKAGLKEEETELKQLNLHKDTEPKPLBGTHLMGVKDSNIH 
ELEHEQEPTCASQMAEPFRTFRDGWVSYYNQPVF/LGWHGSCFP 
LYDCPGL* LHHHRVRIiHSGTENFHPQ YFD3S I S YNWJNNGNCS FY 
LATSKMWFGSDRSDLRIGTAFliFDLVCDLCIHAWKPPGLVRFSF 


5852 


1 


422 


K'rrFPS«LCPIiRQLPEVRGYSGQPLTDPLISLCRSHKCRGKGWG~ 

SSSYPSLPALLRARSAFGHCTHRSCGPEWRIDSISRLEMQGARR 

SGWAQAQPTILLLVPRLRKSLPSIWG/SLMGFFITSGPG/WFRQ 

YYFFISGRH*VLFTBSEFYYVAMDFGGHGL9SHYSPGVPYYLQT 

FVSB I RRWAG KKQS VYFRRCGGCSRAP PLITGGGVGSRKQRWP 

ESGAWAIiAPGIiPAIHGRSWES 


5853 


223 


1346 


RLLGLSRVKGLHGPAASAWISDPBTRGDPGGPWGMWRGg-DLRPR 
PVSLTGLrLVCK*AAQGPQV\HSVKLCFGLGG\PCLL\FPIFRP 
LLLKPRR PRLKPGTRGVAVEPHALRVVH VAHGEEAGI RAAGPGH 
GGVEIPQG/VGSLGARRGLRPSRPSSRHRNRVPAPPPGRPLATP 
HRRRFPPDPALTCPGLGQDQGPREQQKQGSGRHDTILGDWGESE 
SRWVRGNFRTGTAATLIGFSRNPTLNGSENWGSLVSIQEBGPDT 
GWEREKRNP AEMGNPQRWASP I HTPPLG PEI LRAMPE ALRAM PE 
ALGLRPDPATSVPSALS/QTF/PESMPRSCLRNQGETLGMGPVP 
LSSLCITESPSQNWTPCLLLLTCPRGLF 


5854 


86 


938 


KGRNTAPEKKGAAlJjNRENA^a^^GV'/SRWKQDIRRIENHIIQE ' 
LXHL CAM I KRVLLERLENTRKLRELTEGRTLDWPQNRITEJVSAK 
RQIVTEYREKGKRN*EEKKRDLEGRSRRYNLCIIGIPETBDRAS 
GAETI KDIiLE/ENFPELKNELDLQMEKAHR I PLKFNEKKAASRH 
IRVTFL/KFQRRNILGASSORKOVTYKGAKVRLTSDFSPaTT n& 

RRQW/N/PISRVLRENNFEPRIIYSAKLSFLYKGNWKTFLDIQG 
LGK YINQELS LK ILLKDLLQLTENLN 


5855 


536 


2391 


LRSYGCKAPSRISHLHK\FLFLLLPSLLWGYSESPPPITDSWAP 
FI SLTHHVLS QSQS PLS SNCW I CLS THTQ * FTAL PADLLTWTQS 
NVSLHISYLAIPFLAI)9FLKPV/I»*PGNSAKHLSFKLSSLSMVS 

gravallhltasgltsiqtntasskppiwgyVlstqtsfisppp 
lclsrtypnpahatmvgqvpqs lcgliftl/rtp crps ilhpny 

KirsrSAWQKVLCPSGSPTIHTSLHLTTGSSFLSFHPIPGFPAA 
NSALYVSSLKGPPGKNVTIPSPVTGT*QPPHRGSN/RLTVDKDN 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








FFLSPKPNSLHQLPSQ\TPYQAJtiTGJ\ALAGSyPlt4E^feNTLSWL 
PTFI^NFCLSTPSLFFLCDTN+YLCLPANWSGTCTLVFQAPTIN 
ILPPNQTI L I S VBAS IS SSP I RNKWALHL I TLLTGLG 1 TAALGT 
G I AG I TTS I TS YQTLFTTLSITTVEDMHTS I TS LQRQLDFLVGVI 
LQNWRVLDLXTTE KGGTC I YLQEECCFCVNESG I VH IAVRRLHD 
RAASL*HQVADSWWQGS3LI,RWIPWVAPFIiGPLIFLFLLLMIGP 
C1FNLVSRF ISQRLNCFIQASMQKHIDNIFHLCHV* YQSLRGNH 
SEAPEPRP 


5856 


173 


1137


PWLHGLGLSAVFLFYL* / YVTFHLYGGI 2LI*LLIFI SlSciLYK 
FQDVLLYFPEQPSSSRLYVPMPTGIPHENIFIRTKDGIRLNLII* 
IRYTGDNS PYS PTI I YFHGNAGNIGHRLPNALLMLVNLKVNLLL 
VD YRG YG KS EG BAS EBGIjYLDSEAVIjDYVMTS PDLD KTKI YLSG 
RSLG\GAAAIHLASDNSHRISAIMVENTFLSIPHMASTLFSFFP 
MRYIiPtiWCYKNKFLSYRKISQCRMPSLFISGLSDQLIPpVMKKQ 

LYELSPSRTKRLAIPPDGTHNDTWQCQGYFTALEQFIKEWKSH 
SPEEMAKTSSNVTII 


5857 


1597 


5*3 


KLIGKVLV1*SWADAMAAFAVEPQGPALGSEPMMLGSPTSPKPG 
VNAQFLPGFLMGDLPAPVTPQPRSISGPSVGVMEMRSPLLAGGS 
PPQPWPAHKDSSQAPPVRSIYDDISSPGLGSTPLTSRRQPNIS 
VMQSPLVGVTSTPGTGQSMFSPASIGQPRKTTLSPAQLDPFYTQ 
GDSIiTSEDH \ LDDS WGDC I WGFIiKAS A\ S Y I LlAQFAQYGGIS * 
NMWMSNTGNWMHIRYQSKLQARKA1»SKDGRIFGES1MIGVICPCI 
EKSVMESSDRCALSSPSLAFTPPIKTLGTPTQPGSTPRISTMRP 
IATAYKASTSDYQVISDRQTPKKDESLVSKAMEYMFGW 


5858


355 


1419 


PPHQPAAASTSXHQQO^PPPPPQDSSKPWAQGPGPAPGVGSAP 
PASSSAPPATPPTSGAPPGSGPGPTPTPPPAVTSAPPGAPPPTP 
PSSGVPTTPPQAGGPPPPPAAVPGPGPGPKOGPGPGGPKGGKMP 
GGPKPGGGPGLSTPGGHPKPPHRGGGEPRGGRQHHPPYHQQHHQ 
GPPPGGPGGRSEEKISGPRRGFKANLSIJjRRPGEKTYTQRCRFC 
LLGI YLLISRRMNSRRL FAKIWENQEKFLSTKAKDSEFI KLESR 
ALA* NCPKFELG * YTP+GGRQLPSS LFPTHACLPLSCS VI FS PF 

MFPQ*NCWGRKPFRPNLGPHLKGAVCNRWDDPWEGPTGKGHCLN 
FAS 


5859 


307 


1503 


GGSSARPRASSRRMLSRKKTKNEVS KPAEVQGKYVKKETSPLLR 
NLMPS FIRHGPTI PRRTDI CLPDSSPNAFSTSGDGWSRNQSFL 
RTP I QRTPHE IMRRSSNRLS APS YLARS LADVP REYGS SQS F VT 
EVS F AVENGDSGS R Y YYSDN FFDGQRKRPLGDRAHED YRYYEYN 
HDLFQRMPQNQGRHASG I GRVAATS LGNLTNHGS EDI* PLPPG WS 
VDWTMRGRKYYIDHNTNTTHWSHPLEREGLPPGWERVESSEFGT 
YYVDHTWKKAQY\RHPCAPTCTSV*STTSCHI/AS/RQQTERKQ 
SLLVPANP YHTAE I PDWLQVYARAP VKYDHILKWE LFQLADLDT 
YO^MLKLLFMKELEQ I VKMYEAYRQALLTELENRKQRQQW YAQQ 


5860
5861


2956 


1270


TIRVEEFPI^PGGGKAQl>SSASLIiGAGI^gpPTPPPU*LLLFP 
LLLFS RLCGALAGPI IVEPHVTAVWGKWVSLKCLI EVNETITQI 
S WEKI HGKSSQT VAVHHPQ YG FS VQGE YQGR VLFKNYS LNDA7I 
TLHNIG FSDSGKYI CKAVTFPLGNAQSSTTVTVLVEPTVSLIKG 
PDSLIDGK3NBTVAAICIAATGKPVAHIDWEGDLGEMESTTTSFP 
NETAT I IS Q YKLFPXRFARGRR I TCWKH PALEKD IR YS FILD I 
QYAPE VS VTG YDGNW FVGRKGVNL KCNADANPPPFKS VWSRLDG 
QWPDGLLASDNTLHFVHPLTFNYSGVYICKVT\NSPGSKEVTQK 
VHPTFQDPSLPTYPPLPALQFQWASPSTA*TSRD\LATEP*KIA 
psplstl\ati KGWTQLPTIIA* csgvgalfi V\LVKCFGLG I F 
CYRRRRTFRGDYFAKNYIPPSDMQKESQIDVLQQDBLDPYPDSV 
KKENKNPVNKLIRKDYLEEPEKTQWNNVENLNRFERPMDYYEDL 
KMGMKFVSDEHYDENEDDLVSHVDGSVISRREWYV 




2051 


1305 



EVCACVQAFl*bVASSGDDSG^GDKCGCEVGSWGSMRVVMARLi» 
5EGEQGI PTACAAFAQQPAG/EPRRGIiAGVGEGGPQCS WVNYRC 
rLEFLVS LLGTDLARGRGNSASGPTAPADS KQL/ ML* D VHRRVI 
L*E * RMNSGSPARDNAPSQRFCTNLSEGLRFGI S PSWRE AL YGCH 


5862 


1556 


483 


PPFQLIMGBIKVSPDYNWFRGTVPLKKIlVliDDDSKIWSLYDAG 
PRS IRCPL I FLP P VSGTADVFFRQ I LALTGWGYRVT ALQYPVYW 
DHLEFCDGFRKLLDHLQLDKVHLFGASLGGFLAQKPAEYTHKSP 
RVHSLI LCNSFSDTSI FNQTWTANSFWLMPAFMLKKIVLGNFSS 
GPVDPMMADAIDFMVDRLEStiGQSELASRLTLMCQNSYVEPHpiI 
RD 1 PVT IMD V FDQS ALS TEAKEEMY KL YPNARRAHLKTGGNFP Y 
LCRSAE VNLYVQIHL/ R / RNS ME PNTR PLTHQWS VPRS LRCRKA 
ALASARRSSS VSLAVNDELTRCVLV* SVAS AP VSRPFPSGSSGS 
PVL7VSGK 


5863 


2714 


243 


PFPSRGSLPIiAAPREDTMGPI^LFCLLFLYPGU^APSCPQN" 
VNISGGTFTLSHGWAPGSLLTYSCPQGLYPSPASRLCKSSGQWQ 
TPGATRSLS KAVCKPVRCPAP VS F ENG I YTPRLGS YPVGGNVS F 
ECEDG F I \ LRGS P VRQCRPNGMWDGETAVCDNGAGHCPN PGI SL 
GP\ VRTGFR FGHGDKVRYRCS SNLVLTGSSERE CQGNG VW SGTE 
PICRQPYSYDFPEDVAPALGTSFSHMLGATNPTQKTKESLGRKI 
Q IQRSGHLNLYLLLDCS QS VSEND FLI FKESASLM VDR I FS F3 1 
NVS VAI ITFASEPKVLMS VLNDNS RDMTE VIS SIiEMAN YKDHSN 
GTGTNTYAALNSVYLMMNNQMRIiLGMETMAW\QEI RHAI ILL\T 
DGK\SHM3GSPKTAVDH3RE ILN INQKRNDYLDI YAIGVGKLDV 
DWRELNS LG S KKDG ERHAF I LQDTKALHQVFEHMLDVSKLTDTT 
CGVGNMS ANASDQERTPWHVTI KPKS0ET\ C\RGAI*1SDQWVLT 
AAHCFRDGNDHSLWRVNVGDPKSQWGKEFLIEKAVISPGFDVFA 
KKNQGIL\EFYGD\DIALL\KI^QKVKM\STHCQGPSCXP\CTM 
\EANLGFLRETFKGSTCR\DHENEL/VX7NKQSV\PAHF\VAI,\N 
GSKLEHLTliRMGYEWTS CCRGC>SPKKKTM\FPNLT\DVRB\ WT 
D\ QFL\ CS \GPQEDESP \CK* E\SGGA\ VFLERRKRJaSAGGVWC 
SWGL\YNP\CT.GSA\DKNSPKKGPSVAKVPPPTR/DFHIN\LFP 
Q*S PWI*RQHPGGMS * I FLPLLANGHLS PFACPAR I CRPLKFLPS 
EWATLRTL 


5864 


173 


1013 


PLISVPQSLISLPQPLLCFPGGQEPSAPS PCL-Y3 FLWACS FTMG 
KLPPS I PPSS PLACVLKNUCPLQLTPDLKPKCDI FFCNTAWPQY 
KLDNDSK* PBNGTFE PS ILQVLDNS CHKMGKWS E VPD VQAFF\ S 
HWSLPSLCSQC/GLI PNLSS FSPFCSFG/ PPPQVPSP /TESFFS 
MDSSDLPPSPQAAPRQAEPGPN3HLASAPPPYNPFITSPPHTWS 
SLQFHSVTSPPPPAQQFTLKKVAGAKGIVKVSAPFSLSQIR*RL 
GSFSSNIKIQPSSWLIWQQP 


5865 


568 


1684


CLPGPRWGEGWRAGHTIVGCIFFKTAirSHFKGGMYLCVCMCTC 
LSVCVCVQVGSWICV/CVSMCACVSLC7C\ICRCISMYTREHAC 
ACTRV+VYMCMS/VCTCVSTCIDVRVCAHVCVYMCLCLGYA*AC 
TCV*MCVCMHEHVaiC/VCACSCVLL/CRGHICM/MCMSAYICI 
/CVYVCVLC^ACMRMSTCWLVYG*ACTCVWMHM/CSCTCR/C 
VHVCCMSMHACBCLCVYI.HICGCAGTRRWWAGSARGSRSCSRLP 
CWAPGPGLSLPGPSCPSVEQGLGGGPGQLQGRSGBARLGEHRGW 
GSPAAVCSRNCTVS PRRGADCF5APDVPKQPPGWGRAS FEERG C 
GGRGW VCAPPLNG PQCCCFS 3 KPELKAKKKK 


5866 


98 


3197 


AR PEVPAP PAWLSRRGAAKMGDKKDD KDS P KKNKGKERRDLDDlj 
KKEVAMTEHKMS VEE VCRKYNTDCVQGLTHSKAQH ILARDGPNA 
LTPPPTTPEWVKFCRQLFGGFS ILLWXGAI LCFLAYGI QAGTED 
DPSGDhTLYLGIVIJUVVVIITGCFSYYQEAXSSKIMESFK>nviVPQ 
QALVI REGE KMQ VNAEEVWGDLVE I KGGDRVPADLRI I SAHG C 

RGVWATGDRTVMGRIATLASGLBVGKTPIAIEIEHFIQLITGV 
AVFIX3VS FF I LSLI LG YTWLEAVI FLIGI I VANVPEGLLATVTV 
CuTLTAKRMARKW CLVKNLEAVETLGSTS T I CSDKTGTLTQNRM 
TVAHK WFDNQIHEAD TTEDQSGTS FDKSS HTWVALF * H /LltG FC 
^\^KGGODNIPVLKRDVAGDASESALLKCIELSSGSVKLMRE 
RNKKVAEI PFNSTNKYQLSIHETEDPNDNRYLLVMKGAPERILD 
RCSTILLQGKEQPLDEEMKEAFQNAYLELGGLGERVLGFCHYYli 
PEEQFPKG FAFD CDDVN FTTDNLC FVGLMSM IGP PRAAVPDAVG 
KCRS AGI KV IMVTGDHP I TAKAIAKGVGI IFEGNETVEDIAARL 








" MIPVSQVNPRDAKACVIHGTDLKDFTSEQIDEiltQNfaTEIVFAR 
TSPQQKLIIVHGCQRQGAIVAVTGDGVNDSPALKKADIGVAMGI 
AGS DVS KQAADM I LLDDNFAS rVTGVEEGRLlFDNUCKSIAYTL 
?SNIPEITPFLLFIMANIPLPLGTITILCIDLGTDMVPAISLAY 
EAAESDIMKRQPRNPRTDKLVNERLISMAYGQIGMIQALGGFTS 
YFVI LAENG FLPGNLVGIRLNWDD RTVNDLEDS YGQQW T YEQR K 
WE FTCHTAFFVS I VVVQWADLI I CKTRRNS VFQQGMKNKI L I F 
GLF2ETAIAAFLSYCPGMDVALRMYPLKPSWWFCAFPYSFLIFV 
YDEIRKLILRRNPGGWVEKETYY 


5867 


3 


1485 


LPGRRARGGRGLGWPPAQALIX3SRMGKAKVPASKRAPSS=VAKP " 
GPVKTLTRKJCNKKKKRFWKSKAREVSKKPASGPGAWRPPKAPE 
DPSQNWKALQEWIiIiKQKSQAPE KPLVISGMGS KKKPKI IQQNKK 
ETSPQVKGEEMPAGKDQEASRGSVPSGSKMDRRAPVPRTKASGT 
EHNKKGTKERTNGDIVPERGDIEHKKRKAK\GQPQPHPPR/IDI 
WFDDVDPADI EAA IGPE AAKI ARKQLGQSEGS VS LS LVKE Q AFG 
GLTRAIALDCEMVGVGPKGEESMAARVSIVNQYGKCVYDKYVKP 
TBPVTDYRTAVSGIRPENLKQGEELEWQKEVAEMLKGRILVGH 
ALHNDLKVLF LDHPKKK IRDTQ K YKP FKS Q VKSGRPS LRLL SEK 
I LGLQVQQAEH CS IQDAQAAMRL YVM VKKEWESMARDRRP LLTA 

PDHCSDDA*QSCPAAAAAPLQRQCDQSQGOITSPQSGNSGETFS 
ESWQRGVAWCY 


5868


2122 


833 


LTAGAS HTQDASQSTS AKYPAAAQNX*/ C VTNAMR EDLADI W YI R 
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHS PFRARSE PEDPV 
TERSAFTERDAGSGL VTRLRERPALLVS STSWTEDED FS ILLAA 
LESRV* T\MTLDGHNL PS LVCVI TGKGP LREYYSRLI KQKHFQH 
IOVCTPWLFJVED YPLLI/3S ADLG VCLHTS SSGLDLPM KWDMFG 
CCLPVCAVNFKGLHEIiVKHEENGLVFEDSEELAAQrjQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT


5869 


2122 


833 


LTAGASHTQDASQSTSAKX PAAAQNL/ CVTNAMREDLADIWYIR 
AVTVYDKPASFFKETPLDLQHRLFmO^SMHSPFRARSEPBDPV 
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDBDFSILLAA 
LESRV* T\MTLDGHNLPSLVCVITGKGPLR BY YSRLIHQKHFQH 
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMXWDMT?G 
CCLPVCAVNFKCLHELVKHEE^GLVFEDSEELAAQLQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5870 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/ CVTNAMRfibtiADI WYIR 
AVTVYDKPAS F FKETPLDLQHRLFMKLGSMHS P FRARSEP EDPV 
TBRSAFTERDAGSGLVTRIiRERPALLVSSTSWTEDEDFSILLAA 
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH 
IQVCTPMLEAEDYPLLIX3SADLGVCLHTSSSGLDLPMICVVDMFG 
CCLP VCAVNFKCLHEL VKHE S NGLVFEDS EELAAQLQMLFS N F P 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5871 


3 


3465 

3 



FFFCRPLRLYSKTTGDRSAMAGAAGLTAEVSWKVLERRARTKRS 
VLIO^L*LSLRRL*LEPTI*NGLLT*CSRLSVFRFLKV\GSVYEP 
LKS INLPRPDNETLWDKLDHYYRIVKSTLLLYQSPTTGLFPTKT 
CGG DQKAKIQDSLYCAAGAWALALAYRRIDDDKGRTHELEHSAI 
KCMRGILYCYMRO^KVQQFKQDPRPTTCLHSVFNVHTGDELLS 
YEEYGHIiQlNAVSLYLLYLVEMISSGLQI I YNTDBVSFI QNLVF 
CV\ERVYRVP\DFG\VWGKREGKYY*/SGSTELHSSSVGLGKRQ 
L* KQFNGFNLFGNQGCSWSVI FVDLDAHNRNRQTLCSLLPRESR 
SHNTDAA1LPCISYPAFALDDEVLFSQTLDKVVRKLKGKYGF1CR 
FLRDG YRTS LEDPNRC Y YKP AE IKLFDG TPfFFDTPPT vuvtap 

VFRGNPKQ VQE YQDLLTPVLHHTTEG YP VVP KYYYVPADF VE YE 
KNNPGSQKRFPSNCGRDGKLFIiWGQALYT IAKLLADELISPKDI 
DPVQRYVPLKDQRNVSMRFSNQGPtiENDLWHVALlAESQRLQV 
FLNTYGIQTQTPQQVEPIQIWPQQELVKAYLQLGINEKLGLSGR 
PDRPIGCLGTSKIYRILGKTVVCYPIIFDLSDFYMSQDVFLLID 
DIKNALQFlKQYW^lHGRPLFliVLIREDNIRGSRFNPrLDMLAA 
LKKG r IGGVKVHVDRLQTLISGAWEQLDFLRISDTEELPEFKS 
PEELEPP KHS KVKRQS S TPSAP E LGQQPDVNI SEWKD KPTHB IL 
JKLNDC^CIiASQAIIXGIIXKREGPNFITlOSGTVSDHIERVYRR 








AGSQKLWSVVRRAASLLSK^DSl^APSITWLVQa^QWl^Ap^ 
EEEEVISNPliSPRVIQNIIYYKGNTHDEREAVrQQELVIHIGWI 
ISNNPELPSGTLKIRIGWIIHAMEYHLQIRGGDKPALDLYQLSP 
SE VKQLLLDI LQPQQNGRCWLNRRQI DGSLNRTPTGFYDRVWQI 
LERTPNGII VAGKHLPQQ PTLS DMTMYEMNPSLLVEDTLGNIDQ 
PQYRQI WELIMWS I VLERNPELE FQDKVDLDRLVKEAFNEFQ 
KDQSRLKEI EKQDDMTS FYNTP PLGKRGTCS YIiTKAVMNLLLEG 
BVKPNNDDPCIilS 


5872 


63 


665 


VQGYMYRFVI KINSCYSEXTS I CRHRCCPELPATQPWPTPTVFF 
NIAIDSESLGCI\SFKLFADKV/PKRWKKNFVLLNTGEKVLGDK 
GPCFYRIIPG\LCQGGDFTHHNGTGGKSLYSKEFDDEWPI/LKH 
TAPG VLSTANAGPTTNGS QPFI CTAKTEDG * QHWFGKVKDGMS 
IVEALERSGSRNGECTSKKI TAANCGQL 


5873


2240 


506 


RRPPEGGSGGGRRTRARMPLPWSLALPLLLSWVAGGFGNAASAR 
HHGLLASARQPGVCHYGTKLACCYGWRRNSXGVCEATCEPGCKF 
GECVGPNKCRCFPGYTGKTCSQDVNECGMKPRPCQHRCVNTHGS 
YKCFCLSGHMLMPDATCVNSRTCAMINCQYSCEDTEEGPQCLCP 
SSGLRLAPNGRDCLDIDECASGKVICPYNRRCVNTFGSYYCKCH 
IGF2LQYISGRYDCIDINECTMDSHTCSHHANCFNTQGSFKCKC 
KOGYKGMGLRCSAI PENS VKEVIiRAPGTIKDRI KKLLAHKNSMK 
KKAKI KNVTPEPTRTPTPKVNIiQ P FNYEE IVS RGGNSHGG\ KKG 
NEEKMKEGLEDEKREEKALKD*HRRERPFRG\ DVFFPKVNEAGE 
FGIjIL WQRKALTSKLEHKADLNI SVDCSFNHG \ I CDW\KQDR\ 
EDDFDW\NPADR\DNAI \GFY\MAVPGLWQGHK\KDIGRLKLLL 
PDLQPQSNFCLLFDYRIiAGDKVGKLRVFVKNSNNALAWEKTTSE 
DBKKKTG KIQL YQGTDATKS 1 1 FEAERGKGKTGE IAVDGVLLVS 
GLCPDS LLS VDD 


5874 


2 


3387 


ACPRLARRRRRVRSLtRKRRGWLRARWSRGQNKMAARRITQETFD 
AVLQE KAKRYHMDAS GEAVSETLQFKAQDLLRAVPRSRABMYDD 
VHSDGRYSLSGSVAHSRDAGRESLRSDVFSGPSFRSSNPSISDD 
SYFRKECGRDLEFSHSNSRDQVIGHRKLGHFRSQDNKFALRGSW 
EQDFGH PVSQESS WSQEYSFGPSAVLGDFGSSRL IEKECLEKE\ 
SRDYDVDHSG\EA\DS VLRGS \SQVQA\RGRALN IVDQEGSLLG 
. KGETQGLLTAKGG VG KLVTLRNVST KKI PTVNR I TPKTQGTNQ I 
QKNTPSPDVTLGTNPGTEDIQFPIQKIPIjGIjDLKNLRLPRRKMS 
FDIIDKSDVFSRFG I E 1 1 KWAGFHTI KDD I KFS QLFQ TLFE LET 
ETCAKMLAS FKCSLXPEMRDFCFFTIKFLKHSAIJCTPRVDNEFL 
NMLLD KGAVKTKNCFFEI I KP FDKY IMRLQDRLLXSVTPLLMAC 
NAYElSVKMKTLSNPLDLAtiALETTNS LCRKSLALLGQTFS IAS 
SFRQEKIL*AVGLQDIAPSPAAFPNFEDSTLFGREYIDHLKAWIi 
VSSGCPLQVKKABPEPMREEEKMI PPTKPE IQAKAPSSLSDAVP 
QRADHRWGTIDQLVKRVI BGSLSPKERTLLKEDPAYNFLSDEN 
SLBYKYYKLKLAEMQRMSENl>RGAD0KPTSADCAVRA2-ttiYSRAV 
RNLKKKLL P\ WQRRGIiLRAQG\ t»RG\ WKARRA\TTGTQTLLFLR 
APGLKHEGRQAPGLS\QAKPSLPDRND\AAKD\CPLDPV\GPSP 

qdpsleasgpspkpagvdiseapqtsspcpsadidmkdngrtae 
klarfvaqvg\peieqf\si\enstdnpdlwfl\hdqnss\afk 
fy\rkkvfelcpsicftssphnl\htgggdtt\gsqespvdlme 
geaefedepppreaelespevmpeeededdbdggebapa\pgrg 
gpslegstpadglpgea\aeddl/algapalftgllqvtcfpfg 

RGFS3KSLKVGMIPAPKRVCLIQEPKVHEPVRIAYDRPRGRPKS 
tuuuu^ iMJUut Ayy KL> \ TDK \NLGFQ\MLQKMGWKEGHGLGSLG K 
G IR\ SRSACTQQAAWGGSGWGLS PS TCSL PLCS FTAKMAYSWQ L 
IFVF 


5875 


296 


1848 


LAALGGLPLWRLSRRGFREYLLGLSAPSALGGAMRSVSYVQRVA 
LEFSG5LFPHAICIX3DVDProTIiNELVVGDTSGKVSVYKNDDSRP 
WLTCSCQGMLTCVGVGDVCNKGKNLLVAVSAEGWFHLFDLTPAK 
VLDASGBHBTIj IGEEQRPVFKQH I PANTKVMLI S DIDGDGCREL 
WGYTDRWRAFRWEELGBGPEHLTGQLVSLKKWMLEGQVDSLS 
VTLGPLGIiPELMVSQPGCAYAILLCTWKKDTGSPPASEGPTDGS 
/SGDPS CPRRGAAPD I WP Y PQQECLHSPNWQHQT\SHGTES SGS 
5876






GL FflLCTLmJTLtKLMEEMEEADKL LWS VQVJDHQLFAIjE KHjDVTG — 
NGHEEWACAWDGQTY1 1 DHNRTWRFQVDENIRAFCAGLYACK 

TACCRSWAWILTTSL*LVPCFTKRSTIQTSHHSVLPQASIIIPPS 
WTCLIAGEGPF»TPTLPPKGVFGSHGAAAG3ITKQ 


5877 


1122 


224 


HLPLGVPSKVAGAAAMKPQEERETQVAAWLKKITODHPI^QYEV 
K PRTTE I LHHLSERNR VRDRDVYLVI E DLKQKAS E YESEAKYLQ 
DLLMES VNFS PANLS S TGSR YLNALVDSAVALETKDTS LASF1 P 
« w «uu x auxir k i KiKi t,b I K IELEKLE KNLTATLVLEKCLQEDV 
KKAELHLSTER\AKVDNRRQNM\DFLKAKSEEFRFGIQAAGEQL 

sargqvdapsvpiqslvalirenwprlkqqtiplkXkklesyld 

LMP\KPSHCSK*RIEEAK\RELA\SIEAELTRRVS\MMEIi 


5878 


2030 


1907 


otlgkmaasssgekekerlggglgvaggnstrerllsaledobv 

LSRELIEMLAISRNQKLLQAGEENQVLELLXHRDGEFQELMKLA 
LNQGKIHHEMQVLEKEVEKRDSDIQQLQKQLKBAEQILATAVYQ 

akeklksiekarxgaisseeiikyahrisasnavcapltwvpgd 

PRRPYPTDLEMRSGLLGQMNNPSTNGVNGHLPGDAtA/RRKIAR 
Ud 1 v & / w^iyM IXjR * INI I Ii I LQKSVCEL 


5879 


956


2113 


glwkcmqlqgphthrvqp^ptprqqgpqWpvaviagnrpnyly 

RMLRSLLSAQGVSPQMITVFIDGYYEEPMDWALFGLRGIQHTP 

is iknarvsqhykasltatfnlfpeaecfawleedldiavdffs 

FLSQS I HLLBEDDSLYCI SAWNDQG YBHTAED PALLYRVETMPG 
LGWVLRRSLYKEEl.EPKWPTPEKLWDWDMVfMRMPEQRRGRECII 
PDVSRSYHFGIVGLNMNGYFHEAYFKKHKFNTVPGVQLRNVDSL 
KKEAYEVEVHRLLSEAEVLDHSKNPCEDSFLPDTEGHTYVAFIR 
MbK^UD^I^WTQLAKCLHIWDLDVRGNHRGLWRLPRKKKHFLVV 
GVPASPYSVKKPPSVTPI FLEPPPKEEGAPGAPEQT 


5880 


3 


981 


l^TBAAAAGSGi>RAAGWAG^PP'i%LPLSPTSPRCAATMASSDBD " 

GTNGGASBAGEDREAPGKRRRLGFLATAWLTFYDIAMTAGWLVL 

AIAMVRFYMEKGTHRGLYKSIQKTLKFFQTPALLEIVHCLIGIV 

PTSVIVTGVQVSSRIFMVWLITHSIKPIQNEBSWLFLVAWTVT 

EITRYS FYTFSLLDHLPYFI KWARYNFPI HYPVGVAGELLTI Y 

AALPHVKKTOMFSIRLPNKYNVSFDYYYFLLITMASYIPLFPQL 

YFHMLRQRRKVLHG\G*L*KRMIK*SLQTRCFFQNNQDYLSPSF 
NNKNKQLCEIS W I VWFLKI 


5881 


1138 


1324 


o unv.b v*\w*uij1A,PSS QNPLQKAU X liASPREARGT FS ALTACSA - 
SVTSKGKSSSGMWPSAASDRDSPVPLRPPGPVQLPSGTGWVLSD 
* KKKRGRCSS/WIiSQPQHEREKEWLLRRSMAEGERARAASDVL 
CRS1^ETHQLRRTLTATAHKCQHLAKCLDERQHAQRNVGER5 P 
DQSBHTDGHTSVOSyiEJCT^npPMDT t vt»vt7ttt» rc>T>T 
ic * v va ✓ *&»JjVJi£j^KJ^ljliyKvTHVEDLNAKWQRYN 

ASRDE YVRGLHAQLRGLQI PHEPELMRKEISRLNRQLEEKINDC 
A2VKQELAASRTARDAALERVQMLEQQI LAYKDDFMS ERADRER 
AQS R IQELE EKVASLLHQ VS WRQ DSREPDAGRIHAGS KTAKYLA 

ADAEaELMVPGGWRPGTGS QQP E PPAEGGHPGAAQRGQGDLQCPH 
CLQCFSDEQGEEXiLRHVAECCQ 


5882 


26 


441 


GGIHP^FTi^HAQHLTMDCTWRltF^VAAATGTHAQVQLLQSG " 
SBVKKPGASVMVSCYVSGYTLTKLSMfnWRQAPGKGLE*MGPFD 

lqdvetiypqkfqgrvsmteetstettq/aylelsslrsedtav 

HHCATDTV 




2407 


2216 



SGCVEMLYSHSiEYNPEWISVOSAVAPAQLALNSDGDL^ljISG^ 

RTRRD*QLP3AGGPGLQEPLQLGELDITSDEFILDEVDG\VDLR 

SYSKQVELELC^IEQKSIRDYIQBSEKIASLhnqixaCDAVLER 

^EQMLGAFQSDLSSISSEIRTLQEQSGAMNIRLRNRQAVRGKLG 

2LVDGLWPSALVTAILEAPVTEPRFLEQLQELDAKAAAVREQE 

UlGrAACADVRaVLDRLRVKAVTKlREFILQKIYSFRKPMTNYQ 

CPOTALLKYRFFYQFLI^NERATAKEIRDEYVETLSKIYLSYYR 

JYLGRiMKVQYEEVAEKDDLMGVEDTAKKGFFSKPSLRSRtOTIF 

^I*G TRGS VIS PTEI*EAP ILVPHTAQRGEQRYPFEAL FRSQH YAL 

iDNSCREYLFICEPFWSGPAAHDLFHAVMGRTLSMTLKHLDSY 

^CYDAIAVFIjCIHIVLRFRNIAAKRDVPALDRYWEQVLAIiLW 


5883






PRFELILEMNVQSVR3TDPQRLGGLDTRPHYlTRR!irAEPSSALV- 
SINQTIPNERTMQLLGQLQVEVENPVLRVAAEFSSRKEQLVPLI 
NNYDMMLGVIiM\E* BRAADDSKEVBSFQQLIjtfARTQEFlEELLS 
PPFGGLVAPVKEAEAL IERGQAERLRGEBARVTQL IRGFGS S WK 

SSVBSLSQDVMRSFTNFRNGTSIIQGALTQLI0\LYHRFHRV\L 
_ SQPQIiRALPARAELINIHHLMVBLKKHKPNF 


5884


2 


1374 


EFPGRRFRAVMEAGAGAGAGAAGWSCPGPGPfrv'TtLGSYBASBG 
ucxawjUK^gslerrgMQAMEGEVLLPALYEEEEEBSEEEEEVE 
EEEEQVQKGGSVGSLSVKKHRGLSLTETELEELRAQVLQLVAEL 
EETRELAGQHEDDSLELQGLL2DERLASAQQAEVFTKQXQQL0G 
ELRSLRE E IS LLEHEKESEL KE I EQE LHLAQAE I QSLRQAAED S 

ATEHESDIASLQEDLCRMQNELEDMERIRGDYEMEIASLRAEMB 
MKSS EPSGS LGLSD YSGI<3 EELQELRERYH FLNE E YRALQESNS 
SLTGQLADr.BSERTQRATERWLQSQTLSMTSAESQTSEMEFLEP 
D PEMQLLRQQDRDAE EQMHGMKNKCQELCCELE ELQHHRQVS EE 
EQRRLQREI#KCAQNE VLRFQTSHS \SPSHPLPPI PPSS PCLL * A 
LWISALLWCWWAETSS 


5885 


4261 


2522 


GVLARASARLRVPLTGVRACAEPEVGAE pakvagaaepdedggr 
SRLRDCGDYTPSERLGPKGAMLWFQGAI PAAI ATAKRSGAVF VV 
FVAGDDBQSTQMAASWEDDXVTEASSKSFVAIKIDTKSEACLQF 
SQIYPWCVPSSF?IGDSGIPLEVIAGSVSADE1»VTRIHKVRQM 
HLLKSETSVANGSQSES S VST PSAS FBPNNTCENSQSRNABLCE 
IPSTSDTKSDTATGGESAGHATSSQBPSGCSDQRPAEDLMIRVE 
RLTKKr.EERRBEKRKEEEQRHIKKEIERRKTGKEMLDYKRKQEB 
ELTKRMLEERNREKAEDRAARERIKQQIAIJ3RAERAARFAKTKB 
EVEAAKAAALUVKQAEMEVKRESYARERSTVARIQFRLPDGSSF 
TNQFPSDAPIiEEARQFAAQTVGNTYGNFSLATMFPRREFTKEDY 
KKKLIjDLEIiAPSASW1j1>P/ ALFINF*AGRPTASrVHSSSGDIW 

TliLGTVLYPFLAIWRLISNFLFSNPPPTQTSVRVTSSEPPNPAS 

SSKSEKREPVRKRVLEKRGDDFKKEGFaYRLRTQDDGEDENNTW 
NGNSTQQM 


5886


900 


467 


AAGGGRkSRLSKSWPTGPSKSPSGVRCCcARR\AWEDKDEFLDV"- 
IYWFRQIIAWLGVIWGVLPLRGFLGIAGFCLINAGVLYLYFSN 

ylqidebeyggtweltkegfmtsfa/ivhghldhllhchpl*lm 

VYSSQVLPIQSKGPS 


5887


es 


1341 


1 1 a^tuiij ILXJUJ PRK, VAF^SLGTCH KSDPGRPAAQSQPPSPGS - 

gtfgllsfri^tktwtlkkhfvgyptnsdpelktselpplkng 

EVLLEALFLTVDPYMRVAAKRLKEGDTMMGQQVAKVVESKNVAL 
PKGTI VIAS PGWTTflSISDGKDLEKIiLTEWPDTI PLSLALGTVG 
MPGLTAYFGLLEICGVKGGETVMVKAAAGAVGSVVGQIAKLKGC 

kwgavgsdekvaylqklgfdwpnyktveslbetlkkaspdgy 

DCYPDNVGGEFSNTVIGQMKKFGRIA1CGAISTYNRTGPLPPGP 

PPEIGIYQELRMEAFWYRWQGDARQKALKDLLKWVLELPYFVI 

D*LQANTIiVYKSMKSAKPSLEYISEKliVSG\KIQYKSYIlEGFE 
KMPAAFMGMLKGDNLGKTIVKA 




1337 


104 



APGCRG CRATRCP CRGPR WDS LGDEAARS PAAPGGAPGLLGIjRE 
RPDRCHPGGDDRGPQLHRGSPG/SPSELSRRPGPPGL^GLQGPP 
PAPGLPQSRTL/PVLCVCDLSPAQCDINCCCDPDCSSVDFSVFS 
ACS VP WTGDSQFCS QKAV1 YSLNFTANPPQRVFE LVDQINP S I 

FClHITN\*NLHYPLLIQKYL/NENNFDTLMKrrSDGFTLNAESY 
VS FTTKLDI PTAAKYE YGVPLQTS DSFLR F PS SLTS SLCTDNNP 

WLVNQAVKCTRKINLEQCEEIEALSN1AFYSSPE1LRVPDSRK 
<VP ITVQS I VIQSLNKTLTRRED TDVLQP TLVNAGHFS L CVNW 
lEVKYSLTYTDAGEVTKADLSFVLGTVSSWVPLQQKFEIHFIiQ 
3NTQPVPLSGNPGYWGLPIAAGFQPHKGSGIIQTTNRYGQLTI 
jHS TTEQDCLALEG VRTP VLFG YTMQSG CKLRLTGAL PCQLVA Q 
CVKSLLWGQGFPDYVAPFGNSQGP/ADMLDWVPIHPITQSFNRK 
)S CQLPGALVTEVKWTKYGSLLNPQAKI VNVTANLISSSFPEAN 
GNERTI L I S TAVTFVDVSAPAEAG FRAPPAtNARLPFNFFFPF 


5888 


375 


2302 


MiCRT PGVAMQRADS EQPS KR PRCDDSPRTPSNT PS AEADWS PG™~ 

LELHPDYKT WG P EQVCS FLRRGGFEE PVLLKN I RENE I TGALLP 

CliDESRFENLGVSSLGERKKLIiSYIQRLVQlHVDTMKVINDPIH 

GHI ELHPL I»VR I IDTPQFQRLRYI KQ LGG GY YVFPGASHNRFEH 

SLGVGYLAGCLVHALGEKQPELQISHRDVLCVQIAGLCHDLGHG 

PFSHMFDGRFIPLARPEVXWTHEQGSVMMFEHLINSNGIKPVME 

Q YGL I PEED I CFI KEQ I VG PLES P VEDS LWP YKGRPENKS FLYE 

I VSNKRNG I DVDKWD YFARDCHHLG I QNN FDYKRFIKFARVCE V 

DNELR1 CARDIGi: VGNLYDMFHTRNSLHRRAYQHKVGNI IDTMIT 

DAFLKADDYIEITGAGGKKYRISTAIDDMEAYTiCLTDNXFLEIL 

YSTDPKLKDAREILKQIEYRNLFXYVGETQPTGQIKIKREDYES 

LPKEVASAKPKVLLDVKLKAEDFIVDVINMDYGMQEKNPIDHVS 

F YCKTAPNRAI R I TKNQ VS QLL P \ EKFAEQ\ L IRVYCKKVDR KS 

LYA\ARQYFVQW\CADR\NFT\KPQDGRCY*PPTP*HPQKKGW\ 

NDSTFSPKIPTRLPRRLPKSRV\QLFKDDPM 


5889 


1831 


731 


L PAACGR P VTAR PRQAPEGRS GRPRDIOPYPPQVFPPRPDR VAI 
VTGGTDGIG YS TAKHLARLGMHVI I AGNNDS RAKQ WSKI KEET 
LNDKET*VLLCCPGWLCLWNSSDPPTSASRGAGTTGVHHHFLLK 
tX5IFIL\DIASMTSIRQFVQKFKMKKI?LHVLINNAGVMMVPQR 
KTRDGFEEHFGLHYLGHFLLTNLLLDTiiKBSGS PGHSARWTVS 
SATHYVAELNMDDLQSSACYSPHAAYAQSKLALVLFTYHLQRLi, 
ATVEGSH^/TANWDPGVVNTDLYKKVFWATRLAKKLLGWLLFKTP 
DEGAWTS I YAAVTP ELEG VGGRY L YNKKBTKS LHVTYNQKLQQQ 
LWSKSCEMTGVLDVTL 


5890



1322 


200 


FRRG WS AAGRAVP VAF CS R X SAS S PRR PRGAVRLQSGTEAACRS 
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTV3AILTCP 
LEWKTRLQSS S VTL Y I S E VQLN1WJVG AS VNRWSPGPLHCLJCV 
I LEKEG PRSLFRGLGPNLVGVAPSRAI YFAAYSNCKEKLNDVFD 
PDSTQVHMISAAMAGFTAITATNPI WLIKTRLQL* /SQGTAGKR 
RMGAFECVRKVYQTDGLKGFYRGMSAS YAGISETVIHFVI YES I 

kqklleyktastmekdeesvkeasdfvgmmlaaatsk\lvatt r 

AYPHBWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH 
LVRQ I P \NTAIMMAT YELWYLLNG 


5891 


1322 


200 


FRRGWSAAGRAVPVAFCSR3 SASSPRRPRGAVRLQSGTEAACRS 
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAIT.TCP 
LEWKTRLQ S S SVTL YISEVQLNTMAG AS VN R WS PGPLHCL KV 
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD 
PDSTQVHMI SAAMAGFTAI TATNPIWLI KTRLQL * /SQGTAGKR 
RMGAFECVRKVYQTDGLKGFYRGMSAS YAG r SETVIHFVI YES I 
KQECLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTI 
AYPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH 
LVRQI P \NTAIMMAT YELWYLLNG 


5892


1764 


379 


WLRVCGRLS VNSAVSS RTGGWS AGLTCAMQRLQWtjGliLRGPA 
DSGW^PQAAPCLS G APHASAADVVVVHGRRTAI CRAGRGGFKDT 
TPDELLS AVMTAVLKDVNLRPEQLGDI CVGNVLQPGAGAIMAR I 
AQFLSDXPETVPLSTVNRQCSSGLQAVASIAGGIRNGSYD1GMA 
CGVESMSIiADRGNPGNITSRLMEKEKARDCLtPMGITSENVAER 
FGISREKQDTFALASQQKAARAQSKGCFQABrvPVTTTVHDDKG 
TKRSITVTQDSGIRPSTTMEGLAKLKPAFKKDGSTTAGNSSQVS 
DGAAAILLARRS KAEELGLP I LGVLRSYAWGVPPD IMG IGPAY 
AIPVALQKAGLTVSDVDIFEINE\AFASQAAYCVEKLRLPP*EG 
* TP LGGAS G P * GH PLG LHWGHVQ V I TLAQ * S * S ARG KRAYRSGC 

l*WVi.(j£>W£1lab PLitfV r £ x PWGT 


5893 


3 


1653 


ILSKRRCQKAKTKELMAKKVAVI GAGVSGL ISLKCCVD EGIiEP T 
CFERTED I G GVWR FKENVE DGRAS I YQSWTNTS KEMS CFSDFP 
MPEDFPNFLHNSKLLEYFRIFAKKFDLLKYIQFQTTVLSVRKCP 
DFSSSGQWKWTQSNGKEQSAVFDAVMVCSGHHILPHIPLKSFP 
3MERFKGQ YFHSRQ YKH PDGFEGKRI LVIGMGNLGSD IAVE LSK 
VAAQVFISTPJrtGTWVNSRISEDGYPWDSVFHTRFRSMLRNVLPR 
rAVKWMIEQQMNRWFNHENYGLEPQNECYIMKEPVLNDDVPSRliL 
ZGAI KVKSTVKELTETSAI FEDGTVEENIDV1 IFATGYSFS FPF 








LEDSLVKVBW^SLYlCyiFPAliLDKSTrACIGLlOPI.G'Sypirf" 
AE LQ ARWVTRVFKGL CS h PSERTMMMD I 1 KRNEKR I DLFGES OS 
QTLQTMYVD YLDELALE I GAKPD FCS LLFKDP KLAVRLY FGP CN 
S Y* YRLVGPGQWEGARNAIFTQKQRI LKPLKTRALKDSSNFSV3 
FLLKILGLLAVWAFF\ CQLQWS 


5894 


174 


1573 


RYS PKKVLQNKESSLKLGMATALVS AHSLAPLNLKKEGLRWRE 
DHYSTWEQGFKLQGNSKGLGQEPIjCKQFRQLRYEETTGPREALS 
RLRELCQQWLQPETHTKEHILELLVLEQFLIILPKBLQARVQEH 
HPESREDWWLBDLQLDLGETGQQVDPDQPKKQKILVEEMAPL 
KGVQEQQVRHECEVTKPEKEKGEETRIENGKLIWTDSCGRVBS 
SGKISEPMEAHNEGSNLERHQAKPKEKIEYKCSEREQRFIQHI>D 
LIEHASTHTGKKLCESDVCQSSStiTGHKKVLS * ERKVIQC\HGV 
LGKAFQRS3HLVRHQKIHLGEKPYQCNECGKVFSQNAGLLEHLR 
3HTGEKPYLCIHC3GKNFRRSSHLNRHQRIHSQEEPCECKECGKT 
FSQALLLTHHQRIHSHSKSHQCNECG JCAFS LTS DL ZRHHRIHTG 
BKPFKCNI CQKAFRLNSHIiAQHVR I HNEEKP YQCSECGEAFRQR 
SGLFQHQRYHHKDKLA 


5895




86 


HPSLLGAIi'FYPPPSSPWPPPLYLFNNSHRKSRHFItfQRGIHGB " 
KRLF7SDG\TGCLPVLAAAGRARGRAEVI,IS^VGPEDCVVPFLT 
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLBW 
EATE^QPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS 

rqXncpflagetesladivlwgalypllqdpaylpeblsalhsw 
fqtlstqXepcqrVaarrlvlkqXqgvwvlrXpylqkqpqpspa 
egkglspiepeseelatlseeeiamavtawekgleslpplrpqq 
npvlpvagernvlitsalpyvnnvphlgniigcvlsadvfarys 
rlrq^^ntlylcgtdeygtatetkal\eegltpqeicdkyhiiha 

DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD 
TVE qlrcehcarf\ladrfveg VG PFCGYEEARGDQCDKCG KL I 
navelkkpqckvcrscpwqssqhlfldlpklekrleewlgrtl 
pgsdwtpkaqfi tpffgfrewpskprwq*trdlk\ wgnpgt p * e 
gfedk\vfyvwfdatigylsitanytdqwervw\knpeqvdlyq 
fm \akdnvpfks lvfpssalgaednytl\ vshli ateyln yedg 
k\fsksrgvgvfrdm\ahdtgippdisrfyl\lyirpegk\dsa 
fs wtdllucnns \ellnwlgnfinra\gmfvskffgg\ yvpemv 

LTPDDQRLLA\HVTLELQIIYIIQ\LLEKVRIRDAliRSILTIS\RH 
GNQYI \ Q VNE P W\KR I KGS EADRQRAGTVTGLAVN I AALLSVML 
QPYMPTVSAT IQAQLQLPP PACSILLTNFLCTLPAGHQIGTVSP 
LFQ KL ENDQ I E S LRQR FG G GQAKTS PKPA WETVTTAKP QQ I Q A 

LMDEVTKQGNIVRELKAQKADKNBVAASVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5896 


2967 




hpsllgaipfypppsspwppplylfwnshrksrhfinqrgihge 
mrlfvsdgvpgclpvlaaagrargraevlistvgpedcwpflt 
rpkvp vlqldsgnylfs tsaicry ff\l*i*sg weqddltnqw lew 
eatelqptlsaai*yyl\wqgkkg\edvlgsvrrtlthidhsls 
rqnncpflagetbs iad i vlmgalypllqdpayiipeels alhs w 
fqtiistq\epcqr\aarrlvijkq\qgvlalr\pyijqkqpqpspa 
bgkgi^piepeeeelatlseeeiamavtawekgleslpplrpqq 
wpvlp vagernvl itsal pyvnnvphlgni igcvls advfar ys 
rlrqwntlylcgtdeygtatetkal \eegltpqe i cdkyh i iha 
diy\rwfnis fdi fgrtttpqq\tki t\qdifqqllkrgfvlqd 
tveqlrcehcarf\laerfvegvcpfcgyeeargdqcdkcgkli 

NAVELKJCPGCKVCRSCPWO^SOHT. ft.tiT .pkt.t? vq t .ritut odtt 

PGSDWTPNAQPITPFFGFRBWPSKPRWQ*TRDLK\WGNPGTP*E 
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ 
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG 
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 
F SWTDLLLKNNS \ ELLNNLGNF INRA\GM FVSKF PGG \ YVPEMV 
LTPDD QRLLA\HVTLELQHYHQ\ LLEKVRIRDALRS I LTIS \ RH 
GNQYI \QVNEPW\KRIKGSEADRQRAGTVTGLAVNI AALLS VT^IL 
QPYMPTVSATIQAQLQLPPPACS ILLTNFLCTLPAGHQIGTVS P 
LFQKiENDQ IRSLRQRFGGGQAKTSPKPAVVET VTTAKPQQ I QA 








LMDE VTKQGNI VRELKAQ KAD KNEVAAE VAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5897 


2967 




HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRH^IIIQRGIEGE 
^LFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT 
RPKVPVLQLDSGNYLFSTSAI CRYFF\LI«S GWEQDDLTOQWLE W 
EATELQPTLSAALYYL\VVQGKKG\EDVLGSVRRTLTHIDHSLS 
RQ \NCP FLAG ETESLAD IVLWGAL YPLLQD PAYLPEELSALHS W 
FQTLiSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYXiQKQPQPSPA 
EGKGLS P IE P EEEELATLS E E E I AMAVTAW EXGLESLPPLRPQQ 
NPVLPVAGERNVLITSALPyVNNVPHLGMI IGCVLSADVFARYS 
RLRQWNTLYLCGTDBYGTATETKAL\EEGLTPQEICDKYHIIIIA 
D I Y\RWFNI S PD I FGRTTTPQQ \T KI T \ QD IFQQLLKRGFVLQD 
TVEQLR CEHCARF \ LADRFVEGVCP FCG YE EARGDQCDKCX3 KL I 
NAVSLKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL 
PGSDWTPNAQFITPFFGFREWPS KPRWQ * TRDLK\WGNPGTP * R 
GFEDK\ VFY VWFDATIG Y LS I TANYTDQWER WW \KN PEQVDL YQ 
FM\ AKDNVP PHSLVFPSSALGAEDNYTL\ VSHLIATEYLNYBDG 
K\ FSKSRG VG VFRDM \AHDTG I PPDISR FYL\ L YI R PEGK\DSA 
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVS KFFGG \ YVPEMV 
LTPDDQRLLA\HVTLEX,QHYHQ\LLEKVRIRDALRSILTIS\RH 
GNQYI \QVNEPW\ KRIKGSEADRQRAGTVTGLAVNIAALLSVML 
QPYMPTVSATIQAQLQXtPPPACSILLTNFLCTLPAGHQIGTVSP 
LFQKLENDQIESLRQRFGGGQAKTSPKPAWETVTTAKPQQIQA 
LMDBVTKQGNIVRELKAQKAnKNEVAAEVAKTtljDLKKQIjAVAEG 
KPPEAPKGKKKK 


5898 


2967


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE~ 
MPXFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT 
RPKVPVLQLDSGNYLFSTSAICRYFF\I*LSGWEQDDLTNQWLEW 
EATEIiQPTLSAALYYIi\ WQGKKG \ EDVLGS VRRTLTHI DHSLS 
RQ \NCPFLAGBTES LAD IVLWGALYPLLQD PAYLP EELS ALHS W 
FQTLSTQ\EPCQR\AARRLVLKQ\QGVIiALR\PYLQKQPQPSPA 
EGKGLSPIEPEEEBLATLSEEEIAMAVTAWEKGLESLPPLRPQQ 
NPVLPVAGERNVLI TSALP YVNNVPHLGNI IGCVLSADVFARYS 
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA 
DIY\RWFNISFDIPGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD 
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI 
NAVELKKPQ CKVCRS CPWQSSQHLFLDLPKLEKRLE EWLGRTL 
PGSDWTPNAQFITP FFGFREWPS KPRWQ * TRDLK\WGNPGTP * E 
GFEDK\VFYVN FDATIGYLSITANYTDQWERWW\KNPEQVDLYQ 
FM\ AKDNVPFHSLVFPS S ALGAEDNYTL \VSHL IATE YLNYEDG 
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYTRPEGK\DSA 

fsotdllliows\eij^i^finra\gmfvskffgg\yvpemv 
ltpddqrlla\hvtlelqhyhq\llbkvrirdalrs iltis \rh 

GNQYI \QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVKL 

qpymptvsatiqaqlqlpppacsilltnflctlpaghqigtvsp 
lfqklendqieslrqrfgggqakts pkpawetvttakpqqiqa 
lmde vtkqgni vrelkaqkad knevaaevaxlldlkkqlavaeg 
kppeapkgkkkk 


5899


326 


1078


NCPKSKE PNGVRAPSLPSPLRAAMALSDVDVKKQIKH^!MAFI EQ " 
EANEKAEBIDAKAEEEFNIEKGRLVQTQRLKIMEYYEKKEKQIE 
QQKKILKSTMRNQARLK^RARNDLISDLLSEAKLRLSRIVEDP 
EVYQGLLDKLVLQGLLRLLBPVMI VRCRP \ ODIiLLVEAAVnTra T 

PEYMTISQKHVEV\QIDKEA*LAVECSWEWJEVYSGNQRIKVSN 
TLESRLDLSAKQKMPE IRMALFGANTNRKFFI 


5900 


64 


1409 


KAAS RDS P CLk FCP LCGVS SHDLQHRMWYHRLSHLHSRLQDLLK 
GGVIYPALPQPNFKSLLPLAVHWHHTASKSLTCAWQQHEDHFEL 
KYANTVMRFDYVWLRDHCRSASCYNSKTHQRSLDTASVDLCI KP 
KTI RLDETTLFFTWPDGHVTKYDLNWLVKNS YEGQKQKVXQPR I 
LWNAE I YQQAQVPS VDCQS FLETNEGLKKFLQNFLLYGIAFVEN 
VPPTQEHTEKLAERISLIRETIYGRMWYFTSDFSRGDTAYTKLA 
IJ3RHTDTTYFQEPCGIQVFHCLKHEGTGGRTLLVDGFYAAEQVL | 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)








QKAPBEFELLSKSAI\KHBYIEDVGECHQPHDWDWAQS*iS , TriG 
/YKELYLIRYNNYDRAVINTVPyDVVHRWYTAHRTLTIELRRPE 
NEFWVKLKPGRVLPIDNWRVLHGRECFTGYRQLCGCYLTRDDVL 
NTARLLGLQA 


5901




2121 


VAIEQTSLKMKQAVGGAPARPTGErYICNQOGAKYTSLDSFQfHlT" 

KTHLDTVLPKLTCPQCNKEFPNQESLLKHVTIHFMITSTYYICE 

SCDKQFTSVDDLQKHLLDMHTFVFFRCTLCQEVFDSKVSIQLHL 

VAVTOiSNEKKVYRCTSO^FRNEriTlLQLHVKHNHLENOaKVHK 

CIFCGBSFGTEVELQCHITTHSKKYNCKFCSKAFHAI ILLEXHI* 

REKHCVFBTKTPNCGTMGASEQVQKBBVELQTLLTNSQESHNSH 

DGS EEDVDTSEPMYGCDlCXSAAYTWETLLQNHQIiRDHMI RPGES 

AIVEaCKAELIKG^^XTCWCSRTFFSENOLREHMQTHLGPVKEYM 

CPICGERFPSLLTLTEHKVTHSKSLDTGNCRICKMPLQSEEBFIi 

EHCQMKPDLRNSLTGFHCWCMQTVTSTLELKIHGTFHMQKTGN 

GSAVQTTGRGQHVQKLYKCASCLKEFRSKQDLVKLDINGLPYGL 

CAGCVNLSKSASPGINVPPGTNRPGLGQ3STENLS Al EGKGKVGGL 

KTRCS * LATFKF * VL KVELPE PHPKPFHRGVS RPDSNS TQLKT P 

QVS PMPR IS PSQSDE KKTYQC IKCQMVFYNE WDI Q VHVANHKI D 

EGLNHECKIiCSQTFDSPAKI^QCHLIEHSFEGMGGTFKCPVCFTV 

FVQANKLCXJHIFSAHGQEDKIYDCTQCPQKFFFQTELQNHTMTQ 

HSS 


5902 


712 


209 


LKNRRRSRPS IRQSIGSTSVSRWLTSLFTYLDHTADVQ * V* RBF 

ipl:<prq*ed*mfqswlhawgdtleeafeqcamawfgymtdtgt 

\TBPLQTVEVBTQGDDLQS LLFHFLDEWLYKFSADEFFI P \GWGE 
B FS LSKKPQGTEVKAI TY5AMQVYNEEN PE VF VI ID I 


5903 


2106 


735


dtpgpslpsttapfslrslsfpsrpsyllpgdpqplqgrglptt 1 

PALFALS AVPGGAAS PMP PSGLRLLPLLLPLIjWLLVIjTPGRPAA 
GLSTCKTIDMELVKRKRIEAIRGQII*SKLRLASPPSQGEVPPGP 
LPEAVLALYNSTRDRVAGESAEPEPEPEADYYAKEVTRVLMVET 
HNEIYDKFKQSTHSIYMFFNTSBLREAVPEPVLLSRAELRLLRL 
KLKVEQHVBL YQKYSNNS WRYLSNRLLAPSDS PEWI»S FDVTGW 
RQWLSRGGEIEGFRLSAHCSCDSRDNTLQVDINGFTTGR\RGDL 
ATIHGMNRPFLLLMATPLERAQHLQS \SRHR0AL\DTNY\ CFSF 
HGGRNCLRC/VHC*HLIFRKDL\GW\KWI\HE\PKGYHANFC\L 
GPCPYIWSLDTQYSKVLAliYNQ\HKPG\ASAAP\CCVPQALEP\ 
LPIVYr \VGRK?KVEQLSNMIVRSCKCS 


5904 


3 


1126 


MMEEIENAIWTFKEEQRLIYBELIKEEKTTNNELSArSRKIDTW 
ALGNSETEKAFRAISSKVPVDKVTPSTLPEEVLD FEKFLQQTGG 
RQGAWDDYDHQNFVKVRNKHKGKPTFMEEVLEHLPGKTQDEVQQ 
HEKWYQKFLALEERKKES I QI WKTKKQQKREEI FKLKEKADNTP 
VLFHNKQEDNQKQKEEQRKKQKLAVEAWKKQKS I EMSMKCASQL 
KBEEEKEKKHQKERQRQFKLKLLLESYTQQKKEQEEFLRLEKBI 
RE KAEKAEKR KNAADE IS R FQERDLHKLELKI LDRQAKEDEK3Q 
KQRRLAKLKEKVENNVSRD PSRLY/NTHQRLGRTNQKDRTNRLW 
ATSTYPT*GYSNLBTRNTEKSMR 


5905 


287 


2912 


MAS FPPR VNEKEI VRLRTIGELIAPAAPFDKKCGRENWTVAFAP 
DGSYFAWSQGHRTVKLVPMSC^QNFLIiHGTKNVTNSSSLRLPR 
QNS DGGQKNKPREHI IDCGD I VWS LAFGS SVP EKQSRCVNI EWH 
RFRFGQI^LLIATGLNSGRIKIWDVYTGKLLLNLVDHTGVVRDI* 
TFAPDGSLI LVSAS RDKTLRVWDLRDDGN\MMKVLRGHQK WVY \ 
SCAFS PDSSMLCS VGASKAWAAILV* URLCWHHSHTGATMVLS 
WAE RVASLATGLGATFTI G *SNLAFVLQG VLYVHRCWSMSTFCF 
S FFLF FFFKVIS PTVKYH * LLSKL I FQFYGIGSLTSE TNLM * S I 
WLSNGFS VL FFG I LSDSRDI LRL* FNLKFVLI FF * K* CI VS VQK 
KKKPKR I ALLQEERLS *DKPPSSHI*I *QTEVNIRI IiFRAI LHS * 
IiLI FR I * NC I * TYS * I IDPF YI QMT YDRG*FGKNKMVKF* F IEM 
*LYYFHKIAFSFCNVV*HPCCI»PKKFHLAVNII*FACSICFSS*A 
QVGDPSLL*TSDYIiKGRCQWSNNLLTLRFLSVYFFKNLWSGKK 
REGGL*YLTLFISVYFS*LVFGINGF«YSFWKLHCLYFMFRLI 
FKLTFNRNI*NRICMSALINLKTDFNLTMTLS1FFKLLI IYNA* 
YNLN*I*QF*YKMCHFVI*CMSE*SYNICI#FIAGF\LWNMDKYTM 








I RKLEGHHHDWACDFS PDGALLATAS YDTRVYI WDPHKfGD ILM 
EFGHLFPPPTPIFAGGANDRWVRSVSFSHDGLHVASLADDKMVR 
FWR I DED Y PVQ VAPI*SNGLCCAFSTDGS VIi&AGTHDGS VYTTWAT 
PRQVPSLQHLCRMSI RRVMPTQE VQELP I PS KIiLEFLS YR I 


5906 


146 


2038 


REGAGSGRMASGA\YNPYI£IIEQPRQRGMRFRYKCEGRSAGSl 
PGEHSTDNNRTYPSIQIMNYYGKGKV\RITLVTK\NDPYKPHPH 
DLVGiCDCRD\GYYEAEFGQE\RRP\LFFQN\LGIRCVKKKEVKE 
A\ I ITR\ I KAG INPFDV?*KQLNDI EDCDLDWRLWFRVFLPDG 
KGN1>\ TTALPP V\ VSS P I YDNRAPNTAELR VCRVNKNCGS VRGG 
DEIFLLCDICVQKDDI EVRFVLNDWEAKGI FSQADVHRQVAIVFK 
TPPYCKAITEPVTVKMQLRRPSDQBVSESMDFRYLPDEKDTYGN 
KAKKQKTTLLFQKLCQDHVETGFRHVDQDGLELLTSGDPPTLAS 
QSAGITVNFPERPRPGLLGSIGEGRYFKKEPNLFSHDAVVREMP 
TGVSSQAESYYPSPGPISSGLSHHASMAPLPSSSWSSVAHPTPR 
SGNTNPLS S FSTRTL PSNSQG I PPFLRIP VGNDLNASNACI YNN 
ADDIVGMEASSMPSADLYGISDPNMjSNCSVNMMTTSSDSMGET 
DNPRLLSMNL ENPSCNSYLDPRDLRQ LHQMS SS S MSAGANSNTT 
VFVSQSDAFBGSDFSCADWSMINESGPSNSTMPNSHVFVQUSQY 
SGIGSMQNEQLSDSFPYEFFQV 


5907 


99 


1373 


TYbLSS WSS * 'NLDTKIKSQVKV7RKGHKKISWP VpC>PAKQNGK' " 
KATSKVPSAPHFVHPNDHANREAELKiaCWVBEMREKQQAAREQE 
RQKRRTIESYCQDVLRRQEEFEHKEEVLQELNMFPQLDDEATRK 
AYYKEFRKWEYSDVILEVLDARDPLGCRCFQMEEAVLRAQGNK 
KL VL VLNK IDLVPKEWE KWLD YLRNE tiPT VAFKASTQHQVKNL 
NRCSVP VDQASBSLLKS KACFGAENLMRVLGNYCRLGBVRTHIR 
VGWGLPNVGKSSLINSLKRSRACSVGAVPGITKFMQEVYLDKF 
IRLIiDAPGIVPGPNSEVGTILRNCVHVQKLADPVTPVETILQRC 
NLEEI SNYYGVSGFQTTBH FLTAVAHRI»GKKKKGGLYSQEQAAK 
AVLADWVSGKI S FYIPP PATHTLPTHLSAE1 VKBMTBVFDIEDT 

ENKTTVYKIGDLTGYCTNPNRHQMGWAKRNVDHRPKSNSMVDVC 
SVDRRSVLQRIMETDPLO^GQAIiASALKNKKKMQKRADKIASKL 
SDSMMSALDLSGNADDGVGD 


5908 


247 


975 


HCGI KKRGEGS G S PS PASGG FQLGCQ I P 3PSLPS EE ETHPHTRA 
HTRTLRATLTRRPPRSHSTRLRFPMPLDGDGGLASWX/PMRER* 
GWRR P AKAAG ASLGVAATG KRGCRMS KRYLQKATKGXLLI 1 1 F I 
VTLWGKVVSSANHHKAHHVKTGTCBWAiHRCCNKNKIEERSQT 
VKCSCFPGQVAGTTRAAPSCVDASIVEQKWWGHMQPCLEGEECK 
VLP DRKGW S CSS GN KVKTTRVTH 


5909


1 


5002 


PAI PGSTI I WAPGSHSAARADGRHGS LPS QSQAPGALCGARAPP " 
SSNLRADRSMICAQARAGKNLYHNRFLGLAAMAFPSRNS QSLRR 
CKEPIRYSYNPDQFHNMDLRGGPHDGVTIPRSTSDTDLVTSDSR 
STLMGRSS YYS IGHSQDLVIHWDI KEEVDAGDW IGMYL I DEVLS 
ENFLDYKNRGVNGSHRGQI I WKI DAS S YF VEPETKI CFKYYHGV 
SGALRATTPSVTVKNSAAPIFKS IGADETVQGQGSRRLISFSLS 
DFQAMGLKKGMFFNPDP YLKIS I QPGKHS I FpALPHHGQERRSK 
I IGNTVNP I WQAEQFSFVSLPTDVLE IEVKDKFAKS RP I IKRFL 
GKLSMP VQRLLBRHAIGDR WS YTLGRRLPTDHVSGQLQFRFE I 
TSS IH PDDEE I SLSTE P ESAQI QDSPMNNLME SGSGS PRSEAPE 
SSESWKPEQLGEGSVPDRPGNQSIELSRPAEEAAVITEAGDQGM 
VSVGPEGAGELLAQVQKDIQPAPSAEELAEQLDLGBEASALLLE 
DGEAPASTKEEPLEEBATTQSRAGRBEEEKBQEEEGDVSTLEQG 
EGRLQLRASVKRKSRPCSLPVSELETVIASACGDPETPRTHYIR 
IHTLLHSMPSAQGGSAAEEEDGAEEESTIiKDSSEKDGLSEVDTV 
AADPSALEEDREEPEGATPGTAHPGHSGGHFPSLANGAAQDGDT 
HPSTGSESDSSPRQGGDHSCEGCDASCCSPSCYSSSCYSTSCYS 
SSC YS AS C YS P SCYNGNRFASHTRFSS V0S AKI S ESTVFS SQDD 
EEEENSAFESVPDSMQSPELDPESTNGAGPWQDELAAPSGHVER 
S PEGLES PVAG PSNRR EGECP ILHNS QP VSQLPSLRPEHHHYPT 
I DE P LPPNWE ARIDS HGRVFYVDHVNRTTTWQR PTAAATPDGMR 
RSGSrQQMEgLNRRYQNIQRTIATBRSEBDSGSQSCEQAPAGGG 


5910 






GGGGSDSKAKSSQSSLDLRREGSLSPVNSQKITLLLQSPAVKFI 
TNPBPFTVLHANYSAyRVFTSSTC^KHMILKVRRDARNFERYQH 
NRDLVNFINMFADTRLELPRGWE I KTDQQGKSFFVDHNSRATTF 
IDpRIPLQNGRLPNHLTHRQHLQRLRSYSAGEASEVSRNRGASt* 
LARPGHSLVAAIRSQHQHESLPLAYNDKIVAFLRQPNIFEMLQE 
RQPSLARNHTLREKIHYIRTEGNHGLEKLSCDADLV1LLSLFEB 
B IMS YVPLQAAFHPG YS FS P R CS P CSS PQNS PGLQRAS ARAPS P 
YRRDFEAKLRNFYRKLEAKGFGQGPGKIKLI IRRDHLLEGTFNQ 
VMAYSRKELQRNKLYVTFVGEEGLDYSGP3RBFFFLLSQELFMP 
YYGL FE YS ANDT YTVQI S PMS AF VENHLEW FRFS GRILG \ LALI 
HQ YLLDAFFT \R P FYKALL \ R L PC \ D\ LSDU2 YLDEE FHQS LQW 
MKDNNITDILDLTFTVNEEVFGQVTERELK9GGANTQVTEKNKK 
EYIBRMVKWRVERGWCQTEALVRGFYEWDSRLVSVFDARELE 
LVIAGTAEIDLNDWRNNTEYRGGYHDGHLVIRWFWAAVERFNKE 
QRLRLLQ FVTGTSSVP YEGFAAP PW E PMGLRR FLP * KKWGKITS 
LPPRG\HTCLQPDWDLPTVS PRTPMLYEK\LLTA\VEETSTFGT 


5911 


1526 


446


VA2 FAAM E PGRTQ I KLDPRYTADLLE VLKTN YGl P S ACFS Q P^>T 
AAQLLRALGPVE LALTS ILTLLALGS I AI FLEDAVYLYKNTIiCP 
I KRRTLLWKSSAPTWSVLCCFGLWI PRSLVLVEMTITS FYAVC 
FYLLMLVMVEGFGGKEAVLRTLRDTPMMVHTGPCCCCCPCCPRL 
LLTRKKLQ\R*CWALSNTPS *R* R* PWWACFSSPTASMTQQTFE* 
RGAQLYGSTLSSA/CSTLLALWTtiGI ISRQARLHLGEQNKGAKF 
ALFQVLL I LTALQ PS I FS VLANGGQ IACS PP YS S KTR3 Q VMNCH 

LLILBTFLMTVLTRMYYRRKDIIKVGYETFSSPDLDIjNIjKALRWM 
AWTMKGCCTH 


5912 


109 


595 


QiiPLAPCIQliKGbEMRSPKPQSFIIRSSHSGAGIiLVkNPSTPVF^ 

cghrrggaafkykptpvvgpeqrptgqkhmrggvsllsprlecs 
gtisahcnlrlpsssnspapas»lagitgvchhaqi,ifvflvet 

GFHHVGQAGLELL/ETWIHIjPRPPECVLGLQA 




924 


277 


milnkalmlgalalttvmspcggedivadhvasygvnlyqsygp 
sgqyshbfdgdeefyvdlerketvwqlplfrrfrrfdpqfaltm 

IAVLKHbTLNI VIKRSNST7AATNEVPEVTVFSKSPVTLGQPNTLI 
CLVDNIFPPWNITWLS^HSVTEGVSETRPSSPKSDHFr^QDQ 

vtspsfpfe**dl*takveqlgawfepllkhwgabipttl 


5913 


46 


1198 


QLRMAGAEGAAGkQSELaPWSLVDVLEEDEKI^KEACAVLGGS 

dsekcsysqgsvkrqalyacstctpegeepagiclacsyechgs 

HKLFBLYTKRN FRCDCGNS KFKNLECKLLPDKAKVNSGNKYKDN 
FFGLYCICKRPYPDPEDEIPDEMIQCWCEDWFHGRHLGAIPPE 
SGDFQEKVCQACMKRCSFLWAYAAQLAVTKIST\GMMDWCGTLM 
B * /DDQEV1 KPENGEHQDSTLKEDVPEQGKDDVREVKVEQNSEP 
CAGS SSESDLQTVFKNES LNAE S KSG C?CL QELKAKQL IKKDTAT 
YWPLNWRSKLCTCQDCMKMYGDLDVLFXTDEYDTVLAYENECGKI 
AQATDRSDPLMDTLSSMNRVQQVELIC/GIQ*FED 




5914 


960 


124


NLGGSELPPEi^FIQVAS^QRRVDFYIiASIEI^LVAl/GGR¥~ 
ENGALSSVFTYS PKTDSWS YVAGLPRFTYGHAGTI YKDFVYISG 
GHDYQIGPYRKNLI^YDHRIT)VV7EERRPMTTARGWHSMCSLGDS 
IYSIGGSDDNIESMERFDVU3VEAYSPQCKQWTRVAPLLHANSE 
SGVAVWEGRI Y IU3GYS WENTAFS KTVQVYDREADKWSRGVDLP 
KAIAGGSACP IAP* SIX3QRTRKRKAKARGTRTGASDPS CASWDH 
PHRHL PGIrCRPAATS 




5915 
5916 


1604 


703 


FPGRPTRPLKLQRRRKRARIIQAPHClISPRPR^CPPRAT.naDT?a 
PASRAEGPVAWVNGHTEGPAPARSAP KEP PGL PRPLGSFP CPT 
PQEDFPALGGPCPPRMPPSPGFSAWLLKGTPPPPPPGLVPPIS 
KPPPGFSGLLPSPHP\PVSPAPPPPPPQK/RPRLLPAP /PGLPS I 
PRELPGKEPSAHPVHQGLPAERRGP)W3RVQEPLRGVQTGPDLRS 
PVLQELPGPAGGEFPBGL* +AAGPAAH 




5917 


256 
1343


633 

827


3 PRM HEI WGP WHR WES FSLEG E WPS R I PEPS PDS TKGTSGKGCR 
rVTGAVHRHLNHVAGI IPWVLHSQLKPTAATAQDQWTSQQYPDH 
a TRLI LQ *NQATADKNN* TTALLQ PHQRlA VS PRMAEA 

^HQir ) TYLEP/ICLWWYMKIl,TVFLTKSVLEI*KFIHTPQTYR 








?*NDFFG I KEVyVSRRLRKTSF^ RLAVTPLBQAWS KBCVPVDQ 
PMEHLLPSLLSLASDPVPNVRVLLAKALKQMLLEKAYPRNAGNP 
HLEVIEETILALQSDRDQDVSPPAAIiEPKRRNIIDTAVLEKQN 


5918 


13 


1247 


EGAQVARRRS RRQWRAGRCGRGRGGRRAERTGGRGPPf;R pout p 
PGPARRGRRRMETPFVGDEALSGLGGGASGSGGTFAS PGRLFPG 
APPTAAAGSMMKKDALTLSLS EQVAAALKP A PAPAS Y p PA\ ADG 
APSAAPPDGLLASPDLGLLKLASPELBRlillQSNGLVTTTPTSS 
Q FLYPKVAAS EEQE FAEGF VKALEDLKKQNQLGAGRAAAAAAAA 
AGGPSGTATGSAPPGEIAPAAAAPEAPVYA\NLSSY\AGGCRGL 
RGGAAT\VAFAAEPVPFPPPPPPGALGPRRP/RLALQGRRPQTV 
PDVP\SFGES P\PLS P IET\DTPRRI \KAKRKRL\RNPQIRAPK 
PASRKLGAQSRALBRESEDPS*SPEHGSLASTASLLREQVAQUC 
QKVLSHVNSGCQLLPQHQVPAY 


5919 


1 


4254 


TS VQGDSQGTPTS5 QGS INMKHW I S QAI HGSTTSTTS SSSTQSQ ' 
GSGAAHRIiADVMAQTHIENHSAPPDVTTYTSEHSIQVERPQGST 
GSRTAPKYGNAELM ETG DGVPVS SR VSAKIQQLVNTL KR PKRPP 
LREFFVDDFEELLEVQQPDPNQPKPEGAQMIAMRGEQLGVVTNW 
PPS liEAALQRWGTISP KAPCLTTMDTNG KPLYI LTYGKLWTRSM 
KVAYSILHKU3TKQEPMVRPGDRVALVFPNWDPAAFMAAFYGCL 
LAE WPVP I EVP LTRKDAGSQQ IG FLLG S CGVTVALTSDACHKG 
LPKSPTGBIPQFKGWPKLLWFVTESKHLSKPPRDWF\PHIKDAN 
NDTAYTEYKTCK\DGSVLGVTVTRTALLTHCQA1»TQACGYTEAE 
TIVNVLDFKKDVGLWHGILTSVMNMMHVISIPYSI^KVNPLSWI 
QKVCQ YKAKVAC VKS RDMHWALVAHRDQRDI NLSS LRML I VADG 
ANPWS ISSCDAFLNVFQS KGLRQEVI CPCASSPEALTVAIRRPT 
DDSNQ PPGRG VLSMHGLT YGV I RVDS EE KLS VLTVQDVGLVMPG 
AIMCSVKPDGVPQLCRTDEIGELCVCAVATGTSYYGLSGMTKNT 
FE VFAMTSS GAP I SE Y P FI RTGLLG F VG PGG LVF WGKMDGLMV 
VSGRRHWADDIVATALAVEPMKFVYRGRIAVFSVTVLHDERIVI 
VAEQRPDSTEEDS FQWMS RVLQAI DS IHQVG V YCLALVPANTLP 
KTPLGG IHL SETKQL F1»EGS LH PCNVLMCPHTC VTNLPKPRQKQ 
FBIGPASVTfVGNLVSGKRIAQASGRDIiGQIEDNDQARKFLFLSE 
VLQ WRAQTTPDHI LYTLLNCRGA IANS LTCVQLHKRAE K I AVML 
MBRGHLQDGDHVALVYPPGIDLIAAFYGCLYAGCVPITVRPPHP 
QNI ATTIiPTVKM I VEVSRSACLMTTQL I CKLLRS REAAAAVDVR 
TWPLILDTDD* PKKRPAQI CKPCN PDTtiA YLDFS VSTTGMLAGV 
KMSHAATSAFCRSIKLOCELYPSRRVATfT.nPYfvaT/imrr rarr r> 
SVYSGHQSILI PPSELE TNPALWLLAVSQYKVRDTFCS YSVMEL 
CTKGLGS QTES LKARGLDLS RVRTCVVVAEERPR I ALTQS FS KL 
FKDI^LHPPAVSTSFGCRVNLJUCLC^TSGPDP 
HDRVRLVERGSPHSLPLMESGKILPGVRIIIANPETKGPLGDSH 
I/3EIWVHSAHNASGYFTIYGDESLQSDHFNSRLSFGDTQTIWAR 
TGYLGFLRRTBLTDANGERHDALYWGALDEAMELRGMRYHPID 
lETSVIRAHKSVTECAVFTWTNLLVW^/SLDGSEQKALDLVPLV 

TNVVLEEHYLIVGVWVVDIGVIPINSRGEKCRMHLRDGFLADQ 
LDPIYVAYNM 


5920 


1381 


1499 


qi^.vahagvsripp+lfpplhptflslwclhhklp/hppgasm 

VRPPVVPRRPPAHI SSVRQASTQVPRTVPHTQRVANIGTQTTGP 
SGVGCCTPGRPLLPCKCSS AAHSTY RVQE PAVHI PGQEPIiTASM 
LAAAPLHEQKQMIGERIiYPLIHDVHTQLAGKIl'GMLLEIDNSEI* 
LLMLES PES LHAK I DEAVAVLQ AHQAME Q P KAYMH 


5921 
5922 


727 
2475 


157 
495 


VCPGTGGE *GLWGQLGGI>PKETPLKPMDAFTGSGLKRKFDDVDV 
GSSVSNSDDEISSSDSADSCDSLNPPTTA5FTPTSILKRQKQLR 
RKNVRFDQVTV YYFARRQG FTS VPSQGGS SLGMAQRHNS VRS YX 
LCEFAQEQEVNHRE I LREHLKEE KLHAKKMKI*TKNGTVESVEAD 
GLrLDDVSDEDIDVENVEVDDYFFLQPLPTKRRRALLRASGVHR 
IDAEEKQELRAIRLSREECGCDCRLYCDPEACACSQAG I KCQVD 
RMSFPCGCSRDGOGNMAGR I E FNPI R VRTHYLHTIMKLELESKR 

Q\GAAQQPQ\*GALPDCQLQPDRSTGL* DPS WIGS KGLSFTGKG 
AAATHLI ILRVTENRGAEGKRK 

SYSNWGLFPSVFIQVPRSRTGWliKPIFLFYS rYE \ CMET£iKG\T 1 








CLYNATQYKVC3PRMDRPDACYNPSEPAATTVFEIRTGLl,LGDT" 

SKIITRTBEKE IPKQI TLRFDACAAINSKKLEIGCGSLN *ERS* 

RVEWKYVCHESGVCKNCAYWPCVI*AT*KKNKMDSVYLQXGEAN 

PSCAAGHCNPLBLI1TNPLDPHWKKGE32VTLGXNRTCLKPQVVI 

LIKGEVHKCSPXPVFQTFYSELNLPAPELLKKTKNLFIK3IAENV 

IFLLNGTSCYVRGGTTIGDRWPWEA*ELVPTDPAPDI I P I * KAE 

ASNF* VLKTS I IRQYCIARSGKDFIXPVGKPNCIGQKLYTJSTTK 

TIT* *DLNHTEXNPFSKFSKLKTA*AHAESH*DWTVPSGLY* I C 

RHRAYFRLPNKWADSCVIGTIKPSFFLLPIKNGELLGFSVYASR 

EKKGIVIGNWKDNEWPRERIIQYYGPATWAQDGSWGYR/TP/VY 

MUIWIIRLQAILEIISNETGRALTVLAWQETQMRNAIYQNRLAL 

DYLLVAEGCVCRKFNLTNCCLQINDQGQVVKNIVRDMTKLAHVP 

IQVWHKFDPESLFGKWF PA IG3FKTL I VGVLLV IRTCLLL PC VL 

PLL FQMI KG IVATLVHQ KTS AHVNYMNHYRS I SQRDSKSEDESE 

NSH 


5923
5924


137 


638 


QLCGRRGQR FRTS IKRMHPI * RTCPNTNL/ 1 ILLSQENTQtRbL 
QQENRE L W I SLE EHQDALELI MSKYRKQMLQLMVAKKAVDAE P V 
LKAHQSHSAEIESQIDRICTMGEVMRKAVQVDDDQFCKIQEKLA 
QLELENKELRELLS ISS ESLQARKBNSMDTASQAI K 




274 


2146 


EKGKVKDAGAEQWISLSLSCKGSWBTQySNHl.NSL'rPPTSVRRM - 
PLITTVTliLKMVARHHKKLLCSKAPSTQLQQKIFIiHSQMGIHHQ 
SVCMKLKPNTSHIISILMGQPMALVQIiETiAPLTIIIQKFQTQD 
HMKFW KNLPLHSHHt»TPS VPQTVI PKKTGSPEIKLKI TKTIQNG 
REIjFESSLCGDLIjNKVQaSE\Q*NQSIESRKEKRKKSNKKDSSR 
S EERKSHKI PKLEP EEQNR PNER VDTVS E KPREEPVLKEGS P S S 
ANTI FCSNNGSVHW\FKFQVGDLVWSKVGTYFWWPCMVSSDPQL 
EVHTK INTRGARE YH VQFPSNQPERAW VHEKR VREYKGHKQ YEE 
LLAEATKQASNHSEKQKIRKPRPQRERAQWDIGIAHAEKALKMT 
REER I EQ YTF IY'IDKQPEEALSQAKKSVASKTEVKKTRRPRS VI* 
NTQPEQTNAGEVASSLSSTEIRRHSQRRHTSAEEEBPPPVKIAW 
KTAAARKSLPAS ITMHKGSLDLQKCNMS PWKIEQVFALQNATG 
DGKF I DQ FVYSTKG IGNKTE I S VROQDRL I I STPNQRNEKPTQS 

VSSPEATSGSTGSVEKKQQRRSIRTRSESEKSTEWPKKKIKKE 
QVGFLHVES 


5925 


216 


1911 


MMTAESREATC5LSPOAAQEKDGIVIVKVEBSDEBDHMWGQDSTL 
QDTPPPDPEIFRQRFRRFCYQNTFGPREALSRLKELCHQWLRPE 
INTKEQILELLVLEQFLSILPKELQVWLQEYRPDSGEEAVTLLE 
DLELDLSGQQVPGQVHG PEMLARGMVPLDP VQESS SFDLHHEAT 
QSHFKHSSRKPRLLQSRALPAAHIPAPPHEGSPRDQAMASALFT 
ADSQAKVKI 3DMAVSL I LEE WGCQNLARRNLS RDNRQENYGS AF 
PQGGENRNENEESTSKAETSEDSASRGBTTGRSQKEFGEKRDQE 
GKTGERQQKNPEEKTRKBKRDSGPAIGKDKKTITGERGPREKGK 
GLCRS FS IiS SNFTTP EEVPTGTKSHRCDE CGKCFTRS SSLIRHK 
IIHTGEKPYECSECGKAFSSLNSVNLVLHQRIXHTGEKPHECNE 
CGKAFSHSSNLHiHQRIHSGEKPYECNECGKAFSQSSD\LTKHQ 
R IHTGEKP YECSECGKAFNRNS YLI LHRRVHTREKP YKCTKCGK 
\AFTRSSTLTLHHR IHARERASEYSPASLDAFGAFLKSCV 


5926 
5927 


2 


535 


DRCLMLKQGSQPGSPPAT/ CBPPAPPVYQAPCQSCPEPPGAHEP " 
SDSPHHTPVHPPPEHSAACPAPATCCPPPRSSMS 




4146 


1248 



KHFS KFGSQAJj Y UiiKRPASGQNS I SVMPAQ KITKPAAKYGI pla 
YKKYGDKKLHEKKPLQKHKQAHQTPEKRVirrGEBRRKI SEEAAR 
KRR LEFIEKE K KQKDQ IIS LMKAEOMKROB KERLERINRAJ? pnr 
W RNVLSAGGS GE VKAPFLGSGGT 1 APS S FS SRGQYEHYHAI FDQ 
MQQQRAEDNEAKWKREIYGRGLPERQKGQLAVERAKQVEEFLQR 
KREAMQNKARAEGHMGILQN1AAMYGGRPSSSRGGKPRNKEBEV 
YIARLRQI RLQNFNERQQI KAKIiRGEKKEANHSEGQBGSEEADM 
RRKK\ IESLKAHANARAAVLKEQLERKRKEAYEREKKVWEEHLV 
AKGV7CSSDVSPPr*GQHETGGSPSKQ0MRSVISVTSALKEVGVDS 
SLTDTRETSEEMQKTNNAlSSKREIt,RRLNBNUCAQEDEKGKQN 
LSDTFEINVHEDAKEHBKEKSVSSDRKKWEAGGQLVIPLDELTL 
yiS FSTTERHTVGEVI KK3PNGSPRRAWGKS PTDS VLKI LGEAE 








wwiaiujaiM L 1J.KS.U. 15 PEGEKr KFJjITGEKKVQCISHEINPS 
AIVDS PVETKS PEFSEAS PQMSLKLEGNLEBPDDLETE IijQEPS 
GTNKDE\SLPCTITDVWISEEKBTKETQSADRITIQBNEVSEDG 
VSSTVDQLSDIHIBPOTNDSQHSKCDVDKSVQPBPFPHIO/VHSE 
HLNLVPQVQSVQCSPEESFAFRSHSHIiPPKNKNKNSLLIGLSTG 
LFDANNPKMLRTCSLPDLSKLFRTLMDVPTVGDVRQDNLEI DEI 
EDEN I KEG PSDS ED I VFEETDTDLQELQAS MEQLLREQPG EE YS 
EEEESVDKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 

gbiasecscdsvfnhleelri>:leqemgfekffevyekikaihe 

DBD EN I E I CS KI VCNI LGNEHQHL YAKILHLVMADGAYQBDNDE 


5928 


4146 


1248 


KHF SKFGSQA1.YQLKRPAS GQNS I S VMPAQK I TK PAAKYG~I PLA 

YKKYGDKKLHEKKPLQKHKQAHQTPEKRVNTGEERRKISEEAAR 

KRRLEFIEKEKKQKDQIISLMKAEQMKRQEKERLERINRAREQG 

WKNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAIFDQ 

MQQQRAEDNBAKWKREIYGRGLPERQKGQLAVERAKQVEEFL2R 

KREAMQNKARAEGHHGILQNLAAMYGGRPSSSRGGKPRNKEEEV 

YLARLRQIRLQNFNERQQIKAKLRGEKKEANHSEGQEGSEEADM 

RRKK\ IESLKAHANARAAVLKEQLERKRKEAYERBKKVWEEHLV 

AKGVKSSDVSPPLGQHETGGSPS KQQMRSVI SVTSALKEVG VDS 

SLTDTRETSEEMQKTNNAISSKREILRRLKENLKAQEDEKGKQN 

LSDTFEINVHEDAXEHBKEKSVSSDRKKWEAGGQLVIPLDELTL 

DTS FSTTERHTVGEVIKLGPNGSPRRAWGKS PTDS VLKILGEAE 

LQLQTELLENTTIRSEISPEGEKYKPLITGEKKVQCISHEINPS 

ATVDS PVETKS P EFSEAS PQMSLKLEGNLEEPDDLETEILQE PS 

GTNKDE\SLPCTITDVWlSEBKErKBTQSADRITIQENEVSEDG 

VSSTVDQLSDIH IEPGTNDSQHSKCDVDKSVQPE PFFHKWHSE 

HLNLVPQVQSVQCSPBESFAFRSHSHLPPKNKNKNSLLIGLSTG 

LFDANNPKMLRTCSLPDLSKLFRTLMDVPTVGDVRQDNLE IDEI 

EDENIKEGPSDSEDIVFEETDTDLQELQASMEQLLREQPGEEYS 

EEEES VLKNSDVEPTANGTDVADEDDNPS SESALNEE WHSDNS D 

GE IAS ECECDSVFNHLEELRLHLEQEMG FEKFFE V YE KI KAI HE 

D3DENIEICS KI VQNT IX3NEHQHI.YAKI LHI>VMADGA YQEDNDE 


5929 
5930


3 


1558 


LDFSMTTQLFAYVAILIiFYVSRASCQDTFTAAVYEHAAILPNAT 
LTPVSREEALALMNRNI »D ILEGAITSAADQGAH 1 1 VTPEDAI YG 
WNFNRDS L YPYLED I P DP E VNW I PCNNRNRFGQTP VQERLS Ch \ 
AKNNSI Y WANIGDKKPCDTSDPQC PPDGRYQYNTDWF\DSQG 
KLVARYHKQNLFMGENQFNVP KEPE I VTFNTTFGS FG I FTCFD I 
LFHDPAVTLVKD FHVDT I VFPTAWMNVLPHLS AVE FHS AWAMGM 
RVNFLASNIHYPSKKMTGSGIYAPNSSRAFHYDMKTEEGKLLLS 
QLDSHPSHSAVVNWTSYASSIEALSSGNKEFKGXVFFDEFTFVK 
LTGVAGNYTVCQKDLCCHLSYKMSENIPNEVYALGAFDGLHTVE 
GR YYLQ I CTLLKCKTTNLNTCGDS AETAS TRFEM FSLSGTFGTQ 
YVFPBVIJ*SENQIAPGEFQVSTDGRIiFSI»KPTSGPVLTVTLFGR 
LYEKDWASNASSGL?AQARIIMLIVIAPIVCSLSW 




113 


6082



rgncfwivpftmaqrtgledper^Lfvdraviynpatqadwtak - 

KLVWIPS ERHGFEAAS I KEFJIGDEVMVELAENGKKAMVNKDD IQ 

kmnppkfskvedmaeltclneasvlhnlkdryysgliytysglf 

CWINPYKNLPIYSENIIEMYRGKKRHEMPPHrYAISESAYRCM 
LQDREDQS ILCTGESGAGKTENTKKVIQYIiAHVASSHKGRKDHN 

ipge\lbrqllqanpilesfgnartvqndnssrfgkfirinfdv 

TGYIVGANIETYLLEKSRAVROAKDERTFHI F YQLLSG \ AGEHL 
KSDLLLEGFNNYRFLSNGYTPIPGQ\QDKGNFRGDPGEAI>1HIMG 
FSHEEILSMLKWSSVLQFGNISFKKERNTDOASMPENTVAQiCL 
CHLLGMNVME FTRAI LTPRIKVGRD YVQ KAQTKEQADFAVEALA 
tCATYBRLFRWJLVHRINKALDRTKRQGASFlGILDIAGPEIFELN 
SFEQLCJ^TNEKLQQLFNHTMFILEQEEYQRBGIEWNFIDFGL 
DLQPCIDLI ERPANPPGVLALLDEECWFPKATDKTFVBKLVQEQ 
3SHS KFQKPROiKDKADFCIIHYAGKVDYKADEWljMKNMDPljND 
JVATL LHQ S S DRFVAELW KDVDR I VG L DO VTGMTE TA FG S A Y KT 
CKGMFRTVGQLYKESLTKLMATLRNTNPNPVRCI IPNHEKRAGK 
jDPBiVLDQIiRCNGVIJEGIRICRQGFPNRI VFQEFRQRYEI LTP 


5931 






NAIPXGFMLKiKUACBRMlRAtfelLDPNLYRlGQSia F^RAGVLAH~ 

LKKKRDLKITDII3FFQAVCRGYLARKAFAKKQQQLSALKVLQR 

NCAAYLKLRHWQWWRVFTKVKPLLQVTRQEEELQAKDEELLKVK 

BKQTKVEGELEBMERKHQQLLEEKNILAEQLQAETBLFAEAEEM 

RARLAAKKQELEEILHDLBSRVEEBEERNQILQNEKKKMQAHIQ 

DLEEQLDEEEGARQKLQLEKVTAEAKTKKMEEEILLLEDQNSKF 

I KE KKLMEDRI AE CSSQLAEEEEKAKNLAKIRNKGE^MI SDLEE 

RLKKEE KTRQELE KAKRKLDGETTDLQDQ IAELQAQIDELKLQL 

AKKEEE LQGALARGDDE TLHKNNAL XWRELQAQ IAELQ EDFES 

EKASRNKAEKQKRDLSEELEALFCTE^EDTLDTTAAQQELRTKRE 

QEVAELKKALEEETKNHEAQIQDMRQRHATAliEELSEQLEQAKR 

F KANLE K^QGLETDNKEIiACB VKVLQQ VKAES EHKRKKLDAQV 

QELHAKVSEGDRLRVBLAEKASKLQNELDNVSTLLEBAEKKGIK 

FAKDAASLESQLQDTQELLQEETRQKLNLSSRIRQLEEEKNSLQ 

EQQEEEEEARKNLEKGVLALQSQLADTKKKVDDDLGTIESLEEA 

KKKLLXDAEALSQRLEEKALAYDKLEKTKNRLQQELDDLTVDLD 

HQRQVASNLEKKQ\KKFDQLLAEEKSISARYAEERDRAEAEARE 

KETKALS LARALEEALEAKBEFERQNKQLRADMEDLMSSKDDVG 

KNVHELEKSKRALEQQVXEEMRTQIiEELEDELQATEDAKLRLEV 

NMQAMKAQ FERDLQTRDEONE EKKRLL I ItnVR v t pattt nnBom 

RALAVAS KKKME IDLKDLE AQIEAANKARDEVI KQLRKLQAQMK 

DyQRELEEARASRDEIFAQSKESBKKLKSLEAEILQLQBELASq 

ERARRHAEQERDEliADEITNSASGKSALLDEKRRLEAR IAQLEE 

ELEEBQSNMELLNDRFRKTTLQVDTLNABIAABRSAAQKSDNAR 

QQLERQNKELKAKLQELEGAVKSKFKATISALRAXIGQLEEQLB 

QEAKBRAAANKLVRRTEKKXKEIFMQVEDERRHADOYKEQMEKA 

NARMKQLXRQLEEAEEEATRANASRRKLQRELDDATBANEGLSR 

EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTBSK 
TSDVNETQPPQSE 




113 


6082 



RGNCF W I VP FTWAQRTGLEUFER YL F VDRAV Vit\ PATQADWTAK 
KLVWIPS ERHGFEAAS IKEERGDEVMVELAENGKKAMVNKDD IQ 
KMNPPKFSKVEDMAELTCLNEASVLKNLKDRYYSGLIYTYSGLF 
CVVXNPYIQTLPIYSENIIBM^GKKRHEMPPHIYAISESAYRCM 
LQDREDQS I L CTG E SGAGKTENT KKV IQ YLAHVAS SHXGR KDHN 

IPGE\LBRQLLQANPILESFGNARTVQHDNSSRFGKFIRINFDV 
TG Y I VGAN IET YLLEKSRAVRQAXDERT FH I FYQLLSG \AGEHL 
KSDLLLEGFNNYRFLSNGYIPIPGQ\QDKGNFRGDPGEAMHIMG 
FSHEEILSMLKVVSSVLQFGNIS FKKERNTDQASMPENTVAQKL 
OHLLGMNVME FTRAI LTPRI KVGRDYVQ KAQTKEQADFAVEAIiA 
KATYERLFRWLVHRINKALDRTKRQGAS F IGI LDI AG FEIFELN 
SFEQLCINYTKEKLQQLFNHTMFILEQEEYQREGIEWNFTDFGL 
DLQPCIDIiIERPANPPGVLALLDEECWFPKATDKTFVEKLVQEQ 
GSHS KFQKP RQLKDKADFCI I HYAGKVD YKAD SWLMKNMDPLND 
NVATLLHQS SDRFVAELWKDVDRI VGLDQ VTGMTBTAFGSAYKT 
KKGMFRTVGQLYKESLTKLMATLRNTNPNFVRCI IPNHEKRAGK 
LDPHLVLDQLRCNGVLEGIRICRQGFPNRIVFQEFRQRYErLTP 
KAI P KG FMDGKQ ACERM I RALELD PNLYR I GOS K I F FRAG VLAH 

LEEERDLKITDIIIFFQAVCRGYLARKAFAKKQQQLSALKVLQR 
NCAAYLKLRHWQWWRVFTKVKPLLQVTRQEEELQAKDBELLKVK 
EKQTKVEGELEEMERKIIQQLLEEKNILAEQLQAETELFAEAEBM 
RARIiAAKKQELEEILHDL ESR VB EEEE RNQ ILQNEKKKMQAH I Q 
DLEEQIjDEEEGARQKLQLEKVTAEAKIKKMEEEILLLEDQNSKF 
i MbPJbJjntLUH. XAhCS S QIjAEBEEKAKNLAKI RNKQEVM IS DLEE 

RLKKEEKTRQELEKAKRKLDGETTDLQDQIAELQAQIDELKLQL 
^KKEEEI^AIiARGDDETLHKNNALKWRELQAQIAELQEDFES 
?iOlS RNKARKQKRDLSEELEALKTELEDTLDTTAAQQELRTKRE 
2E VAELKKALEEETKNHEAQ IQDMRQRHATALEELSEQIiEQAKR 
? KANLEKNKQGLETDNKELACEVKVLQQVKAESEHKRKKLDAQV 
JELHAKVSEGDRLRVEIABKASKLQNELDNVSTLLEEAEKKGIK 
"AKDAAS LES QLQDTQBLLQEETRQ KLNLSSR I RQLBEE KNSLQ 
"QQREEEEARKNLEKQVIjALQSQIiAJDTKKyTODDLGTI 








KKKLLKDAEALSQRLEE KALAYDXLEKTKNRLQQELDDLTVDLD 
HQRQVASNLEKKQ \KKFDQLLAEEKS ISARYAEERDRAEABARE 
KETKALSLARALEBALEAKEEFERQNKQLRADMEDLMSSKDDVG 
KNVHELEKSKRALEQQV\ EEMRTQLEELEDELQATEDAKGRLEV 
NMQAMKAQ FERDLQTRDEQNE E KKRLL I KQVH ELBA ELEDERKQ 
RALAVASKKKMEIDLKDLEAQI EAANKARDEVIKQLRKLQAQMK 
DYQRELEE ARASRDE I FAQS KES EKKLKSLEAEI LQLQE 3LASS 
ERARRHAEQERDELADEITNSASGKSALLDBKRRLEARIAQLEE 
ELEEEQSNMELLNDRFRKTTLQVDTLNAELAAERSAAQK3DNAR 
QQ LERQNKELKAKLQE LEGAVKS KFKATI S ALEAKIGQCEEQLE 
OEAKERAAANKLVRRTEKKLKEIFMQVEDERRHADQYKEQMEKA 
NARMKQLKRQ LE EAE E EATRANAS RRKLQRE LDDATE ANEGLS R 
EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQ3E 


5932 


33 


572


RHIiEEICFLFLQKGRKLKLSGPRWEEGKPRGTGGLWVKAEANMG 
FGATLAVGLT I ? VLS WTI I ICFTCSCCCLYKTCRRPRPV\APP 
PHPP/PWHAPYPQPPSVPPSYPGPSYQGYHTMPPQPGMPAAPY 
PMQYP? PYPAQPMGP PAYHBTLAGGAAAPYPASOPPYNPAYMDA 
PKAAL 


5933 


1 


3190 


GTRKLKMADKTPGGSQKAS5KTRSSDVHSSGSSDAHMDASG"P5D " 

SDMPSRTRPKSPRKHNYRNBSARES LCDS PHQNLSRPLLBNKLK 

AFSIGKMSTAKRTLSKKEQEELKKKEDEKAAAEXYEEFLAAFEG 

SDGNKVKTFVRGGWNAAKEEKETDEKRGKIYKPSSRFADQKNP 

PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 

QEERDERHKTKGRbSRFEPPQSDSDGQRRSMDAPSRRNRSSGVIi 

DDYAPGSHDVGDPSTT\NFYLGN I \NPQMNLKKCCCCEFGRFGP 

IASVXIMWPRTDEERARERNCGFVAFMNRRDAERALKNJUIJGKMI 

MSFEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP 

RERLKWPNAPMLPPPKNKEDF^KTLSQAIVKVVIPTERNLLALI 

HRMIEFWREGPMFEAMIMNREINNPMFRFLFENQTPAHVYYRW 

KLYS ILQGDSPTKWRTBD FRMFKNGS FWRPPPLNPYLHGMS E EQ 

ETEAFVEEPSKKGALKE EQRDKLEE I LRGLTPRKNDIGDAMVFC 

LNNAEAAEEIVDCITESLSILKTPLPKKIARLYLVSDVLYNSSA 

KVANAS YYRKFFETKLCQ I FSDLNATYRTIQGHLQSENFKQRVM 

TCFRAWED WAI YPEPFL IKLQNI FLGLVNI I EEKETKD VPDDLD 

GAP I EEELDGAPLEDVDG I P IDATP I DDLDG VP I KSLDDDliDG V 

PLDATEDS KKNE PI FKVAPSKWEAVDESELEAQAVTTS KWELFD 

QHEESEEEENQMQEEESEDEEDTQSSKSEEHELYSNPIKEEMTB 

SKFS KYSEMSEEKRAKLREI BLKVMKFQDBLESGKRPKKPGQS F 

QEQVBHYRDKLLQREKEKELERERBRDKXDKEKLESRSKDKKEK 

DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSERSER 

SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 

KKSGKKSRSQSRS PHRSHKKS KGKTNTGRKFFKKAVT YWKCDL F 

LCPERSVF 


5934 


1 


3190 


GTRKLKMADKTPGGSQKAS S KTRSSDVHSSGSSDAHMDASGPSD 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHONLSRPLLENKLK 
AFS IGXMSTAKRTLSKKEQEELKKKBDEKAAAEI YEE FLAAFEG 
SDGNKVKTFVRGGWNAAKEEHETDEKRGKIYXPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRLSRFBPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGS HDVGD PSTT\ N FYLGN I \ NPQMNLKKCCCQEFGRFGP 
LASVKIMWPRTDEERARERNCGFVAFMNRRDAERALKNLNGKMI 
MSPEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP 
RERLKNPNAPMLPPPKNKEDFEKTLSQAIVKWIPTERNIjL^ 
HRMIEFVVREGPMFEAMIMNREINNPMFRFLFENQTPAHVY YRVf 
KLYSILQGDSPTKWRT2DFRMFKNGSFWRPPPLNPYLHGMSEEQ 
ETEAFVE E PS KKGALKE EQRDKLEE I LRGLTPRKND IGD AMVFC 
LNNAEAAEE I VDCI T ES LSI LKTP LP K KI ARLYLVSDVL YNSS A 
KVANAS YYRKFFETKLCQI FSDLNATYRTIQGHLQSENFKQRVM 
TCFRAWEDWA I YPB PFL I KLQN I FLGLVNI I BEKE TE D V PDDLD 
GAP I EEELDGAPLEDVDGI PI DAT PI DDLDG VP I KSLDDDLDGV 








PLDATEDS KKNEPI FKVAPS KWEAVDES ELE AQAVTTS KWELFD 
QHEESEEEENQNQBEESEDEEDTQSSKSEEHHLYSNPIKEEMTE 
SKFSKYSRMSEEKRAKLREIELKVMKFQDELESGKRPKKPGQSF 
QEQVEHYRDXLLQRE KEKELBRERERDKKDKE KLBS RSKDKKEK 
DECTPTR KERKRRKSTS P S PSRS SSGRRVKS PS PKS ERSERSER 
SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSG KKS R S QS RS PHRS HKKS KGKTNTGRKFFKKAVT YWKCDLF 
LCPERSVP 


5935


3


4433 



SYWLSGWRLSRPPRQ^AGHRGIGRFGTMAPVHGDDCEIGASAL 
SDSGSFVSSRARREKKS KKGRQBALERLKKAKAGERYKYEVEDF 
TGV YEEVDEEQ YS KLVQ ARQDDDW I VDDDGI G YVEDGRE I FDDD 
LEDDAIiD ADE KGKCGKARNKDKRNVKKLAVTKPNN I KSM F I ACA 
GKKTADKAVDLS KDGLLGDILQDLNTETPQIT PPPVMILKKKRS 
IGASPNPFSVHTATAVPSGKIASPVSRKEPPLTPVPLKRAEFAG 
DDVQVESTEEEQESGAMEFEDGDFDEPMEVBEVDLEPMAAKAWD 
KESEPAEEVKQSADSGKGTVSYLGSFLPDVSCWDIDQEGDSSFS 
VQEVQ VDSSHL PLVKGADE EQVFHFY WLDAYEDQ YNQ PG WFLF 
GKVWI ESAETHVSCCVMVKWIBRTliYFLPREMKIDLNTGKETGT 
FI SMKDVYEE FDEKIATKYKIMKFKS KPVEKNYAFEIPDVPEKS 
EyLEVKYSAEMPQLFQDLKGETFSHVFGTNTSSLSLFLMNRKIK 
GPCWLEVKKSTALNQPVSWCKVEAMALKPDLVNVIKDVSPPPLV 
VMAFSMKTMQNAKNHQNE 1 I AMAALVKRS FALDKAA PKP PFQSH 
FCWSKPKDC I FPYAFKEVI EKKNVKVEVAATERTLLGFFLAKV 
HKIDPDI I VGHK I YGFELEVLLQRINVCKAPHWS KIGRLKRSNM 
PKLGGRSGFGERNATCGRMICDVEISAKELIRCKSYHtiSEliVQQ 
ILKTERWIPMENIQNMYSESSQLLYLLEHTWKDA\KFILQIMC 
ELNVLPLALQITNIAGNIMSRTLMGGRSERNBFLLLHAFYENNY 
1VPDKQ IF"RKP QQKLGDEDEE IjDGDTNKYKKGRKKG AYAGGLVL 
DPKVGFYDKFILIiLDFNSLYPSIIQEFNICFTTVQRVASBAQKV 
TEDGEQEQIPELPDPSLEMGILPREIRKLVERRKQVKQLMKQQD 
LNPDLILQYO IRQKALXLTANSMYGCLGFSYSRFYAKPLAALVT 
YKGRE ILMHTKEKVQKMNLE VI YGDTDS IMINTNSTNLEBVFKL 
GNKVKSE VNKL YKLLE I Dl DGVFKSLLLLKKKKYAAL WEPTSD 
GNYVTKQELKGLDIVRRDWCDLAKDTGNFVIGQILSDQSRDTIV 
ENIQKRLIEIGENVLNGSVPVSQFEINKALTKDPQDYPDKKSLP 
HVHVALWI NSQGGRKVKAGDTVS YVI CQDGSNLTAS QRAYAPEQ 
LQKQDNLTIDTQ YYIiAQQIHPWARI CE P I DG I DAVL IATGWEI* 
\DPTQFKVHHYHKDEENDALLGGPAQLTDEEKYRDCERFKCPCP 
TCGTENI YDNVFDGSGTDMEPSLYRCSNIDCKAS PLTFTVQLSN" 
KLIMD I RR FI KK YYDGWL I CEEPTCRNRTRHLPLQFSRTGP LCP 
ACMKATLQPEYSDKSLYTQLCFYRYIFDAECALEKLTTDHSKDK 
LKKQFFTPKVLQDYRKLKNTAEQFLSRSGYSEVNLS KLFAGCAV 
K3 


5936


1124 


139 


RGEEQFDAEFRRFACLGFGERLQEFSRLLRAVHRSRAWTCYIAI 
RMIiMATCCPSPTTTACTGPWQRAPPLRLLVQKREADSSGLAFAS 
NSLC RRK KGLLLR PVAPLRTRP PLL I S L PQDFRQVS SVI DVDLI* 
PETHRRVRLHKHGSPRPLGFYIRDGMSVRVAPQG\LERVPGIF1 
SRLVRGGLAESTGLLAVSDEI LE VUG IEVAGKTLNQVTDMMVAN 
SHN\L1VTVKPANQRNNWRGASGRLTGPPSAGPGPAEPDSDDD 
S S DLVI E NRQ PP SSWGLSQG P PCWDLHPG CRHPGTRSS LPSLDD 
QEQASSGWGSRIRGDGSGFSL 


5937 


31 


1600 


PTSLLKSTVQLMCRLLQDKRYQCVYSLAEIFKVLASFYVrLVIL 
YGLTSSYSLVWMLRSSLKQYSFEALREKSNYSDIPDVKNDFAFI 
LHLADQYDPLYSKRFSIFI^EVSENKLKQINLNNEWTVEJaKSK 
LVKNAQDKI B LHLFMLNGLPD WFBLTEMEVLS LELI PE VKLPS 
AVSQLVNLKE LRVYHS 3 LWDHPALAFLEENI* KILRIjKFTEMGK 
IPRWVFHLKNLKEJjYLSGCVLPEQLSTMQLEGFQDLKNIiRTLYL 
KSSLSRIPQVVTDLLPSLQKLSLDNEGSKLVVLNNLKia^VNLKS 
LELISCDLERIPHSIFSL^LHELDLRENWLKTVEEIISFQHLQ 
NLSCLKLWHNMIAYIPAQIGaLSNLEQLSLDHNNIENLPLQLFL 
CTKLHYLDLSYNHLTFIPEEIQYZ,\SNLQYFAVTNNNIBMLPI)G 
LFQCKKLQCLLLGKNSIjMNLS PHVGELiSNIjTHREPI g \ n yletl 
PPELEGCQSLKRNCLIVEENLIJ»TLPLPVTERI>QTCLDKC 


5938 


395 


185* 


ykgegffcnqeargerrkkkkamsspniwstgssvystpvfsqk" 

MTVWILLLLSLYPGFTSQKSDDDYBDYASHKTWVLTPKVPBGDV 
T7ILNNLLEGYDNKLRPDIGVKPTLIHTDMYVNSIGPVNAINME 
YTIDIFFAQTWYDRRIaKFNSTIKVLRLNSNMVGKIWIPDTFFRN 
SKKADAHWITTPNRMLRIWNDGRVLYSLRLTIDAECQI*QI*HNFP 
MDEHSCPfcEFSS YG Y PR E EIVYQ WKRSS VE VGDTRS WRL YQFS P 
VGLRNTTBVVKTTSGDYVVMSVYFDLSRRMGYFTIQTYI PCTLI 
WLS WVS PWINKDAVP ARTS LG I TTVLTMTTLS TI ARKS I*P KVS 
YVTAMDLFVSVCFIPVFSALVBYG\TLHYPVSNRKPSKDKDKKK 
KNPAPTIDIRPRSATIQMNNATHLQERDEBYGYECLDGKDCASF 

FCCFEDCRTGAWRHGRIHIR.IAKMDSYAR1FFPTAFCLFNLVYW 
VSYLYL 


5939


66 


1404 


I RPGYUKEVQENS PGH RAG LEP FFDF I VS HTgS RJLNKDKDTLKD 

LLKANVEKPVKMLIYSSKTLELRETSVTPSNLWGGQGX.LGVSIR 

FCSFDGANENVWHVLEVESNSPAALAGLRPHSDYIIGADTVMNE 

SBDLFSLIETHEAKPLKLYVYNTDTDNCREVIITPNSAWGGEGS 

LGCGIGYGYLIIRIPTRPFEEGKKISIiPGQMAGTPITPLKDGFTE 

VQI£SVNPPSLSPPGTTGI3QSLTG1*SISSTP\PAVSSVLSTGV 

PTVP\LLPPQVNQSLTSVPPMESSyLHLPGLMPFTRQGLPNLPQ 

PSTFNLPR\PTHS?CPGVGLYQBFVKPGVLPPLSSMPPRNLPG\I 

APLPLPSEFLPSFPLVPBSSSAASSGELLSSLPPTSNAPSDPAT 

TTAKADAAS fl LT VDVTP PTAKAPTTVEDR VGDSTP VSEKP VSAA 
VD ANAS ESP 


5940


145 


717 


RRSASRSAS PRQS AGTAVTTGTRAGGTC LAAAHHRMRWRADGRS 
LEKLPVHMGLVITEVEQBPSFSDIASLWWCMAVGISYISVYDH 
C^IFKRNNSRLMDEILKQQQELLGLDCSKYSPEFANSNDKDDQV 

LNCHI»AVKVIjS PEDGKAD I vraaqdfcqlvaqkqkrptdldvdt 

IA\VYLVQMWLILI 


5941 


13 


6147 


MCl^RMGASSPRSPEPVGPPAPGLPFCCCGSLtJvVvVLLALPVA"' 

wgqcna?em\lpfarptnltdefefpigtylnyecrpgysgrpf 

S 1 1 CIrfWS VWTGAKDRCRRKS CRNPPDP VNGMVHVI KG I QFGS Q 
I KYS CTXGYRL IGS SSATCI I SGDTV I WDNETP ICDR I PCGLP P 
TITNODFISTWREWFHYGS WTYRCNPGSGGRKVFEL VGEPS I Y 
CTSNDDQVGIWSGPAPGCI I PNKCTPPNVENG 1 LVS DNRSLFS L 
NEWEFRCOPGFVMKGPRRVKCQALNKWEPELPSCSRVCQPPPD 
VLHAERTQRDKDNFSPGQBVFYSCEPGYDLRGAASMRCTPQGDW 
S PAAPTCE VKS CDD FMGQLLNGRVLFP VNLQLGAKVD FVCDEG F 
QLKGSSASYCVLAGMESLWNSSVPVCBQIFCPSPPVIPNGRHTG 
KPLEVFPFGKAVNYTCDPHPDRGTSFDLIGESTIRCTSDPQGNG 
VWSS PAPRCG1LGHCQAPDHFL FAKLKTQTNASD FP IGTSLKYE 
CRPEYYGRPFSITCLDNLVW33PKDVCKRJCSCKTPPDPVNGMVH 
VITDIQVGSRINYSCTTGHRL IGHSSAECI LSGHAAHWSTKPPI 
CQRIPCGLPPTIANGD F ISTNRENFH YGS WTYRCNPGSGGRKV 
FELVGEPSIYCTSNDDQVGIWSGPAPQCIIPNKCTPPNVENGII* 
VSDNRS LFSLNE WEFRCQPG FVMKGPRRVKCQALNKWEPELPS 
CSRVCQPPPDVLHAERTQRDKDNFSPGQEVFYSCBPGYDLRGAA 
SMRCTPQGDWSPAAPTCEVKSCDDFMGOIiLNGRVLFPVNliQLGA 
KVDFVCDEGFQLKGSSASYCVIiAGMESLWNSSVPVCEQIFCPSP 
PVIPNGRHTGKFLEVFPFGKAVNYTCDPHPDRGTSFDHGESTI 
RCTSDPQGNGVWSS PAPRCG ILGHCQAPDHFLFAKLKTQTNAS D 
FPIGTSLKYRCRPBYYGRPFSITCLDNLVWSSPKDVCKRKSCKT 
PPDP VNGMVHVITD I QVGSR INYS CTTGHRLIGHSSAECILSGM 
TAHWSTKPP ICQRI PCGLPPTI ANGDFISTNR3NFKYGS WTYR 
CNLGSRGRKVFELVGEPS IYCTSNDDQVG I WSGP APQCI I PNKC 
TPPNVENG ILVSDNRSL FSLNBWEFRCQPGFVMKGPRRVKCQA 
LNKWEPBLPSCSR VCQP PPE1 LHGEHTPSHQDNFS PGQE V? YS C 
EPGYDLRGAASLHCTPQGDWS PEAPRCAVKS CDDFLGQLPHGRV 
LFPLNLQLGAKVSFVCDEGFRLKGSSVSHCVLVGMRSIiWNNSVP 
VCEHIFCPNPPAILNGRHTGTPSGDI PYGKE IS YTCDPHPDRGM 
TFNLIGB S TI RCTSDPHGNGVW S S PAPRCELS VRAGHCKfrPEQF 
PFASPTIPINDFEFPVGTSLNYECRPGYFGKMPSISCLENIjVWS 
SVBDNCRRKSCGPPPEPPNGMVHINTDTQFGSTVNYSCNEGFRL 
IGSPSTTCLVSGNNVTWDXKAPICEI ISCBPPPTISNGDFYSNN 
RTSFHNGTWTYQCHTGPDGEOLFELVGERSXYCTSKDDQVGVW 
SSPPPRCISTNKCTAPBVENAIRVPGKRSFFSIjTBI irfrcqpg 
FVMVGSHTVQCQTNGRWGPKIjPHCSRVCQPPPEILHGEHTLSHQ 
DNFSPGQEVFYSCEPSYDliRGAASLHCTPQGDWSPEAPRCTVKS 
CDDFLGQLPHGRVLLPLNLQU3AKVSFVCDEGFRLKGRSASHCV 
LAGMKALWNSSVPVCEQIFCPNPPArLNGRHTGTPLGDIPYGKB 
VSYTCDPHPDRGMTFNLIGBSTIRRTSBPHGNGVWSSPAPRCEL 
PVGAACPHPPKIQNGHYIGGHVSLYLPGMTISYTCDFGYLLVGK 
GFI FCTDQG I WSQLDH YCKE VNCS FPLFMNG I S KSLEM KKVYHY 

GDYVTLKCEDGYTLEGSPWSQCQADDRWDPPLAKCTSRTHDALI 
VGTLSGTIFFILLIIFLSWIILKHRKGNNAHENPKEVAIHLHSQ 
GGSSVHPRTLQTNEENSRVLP 


5942 


4509 


saa 


^LYTOMI^PlAYGISHKAYQIDPPL\RKHREO\tVIBTVGR^ 

DK\AQMIRFEERTGYFSSTDLGRTASHYYIKYNTIETFNELFDA 

HKTEGDIFAIVSKAEEPDQIKVREEEIEEIiDTLLSNFCEIiSTPG 

GVENSYGKINI LLQTYINRGEMDSFS LISDSAYVAQNAARI VRA 

LFE IAIjRKR WPTMT YRLLN1»S KAIDKRLWGWAS PLRQFS I LP PH 

MLTRLEBKKLTVDKLKDMRKDEIGHILHHVNIGLKVKQCVHQIP 

SVMMEAFIQPITRTVLRVTLS I YADFTWNDQVHGTVGEP WWI W 

EDPTNDHIYHSE YFLALKXQVISKBAQLLVFTI PI FEPLPSQYY 

IRAVSDRWUSAEAVCHNFQHLILPERHPPHTEItLDLQPLPITA 

LGCKAYEALYNFSHFNPVQTQI FHTLYHTDCNVLLGAPTGSG.KT 

VAAELAI FRVFNKYPTSKAVYIAPLKALVRERMDDWKVRIEEKL 

GXKVIELTGDVTPDMKSIAKADLIVTTPEKWEGVSRSWQNRNYV 

QQVT3LIIDEIHLLGEERGPVLEVIVSRTNFISSHTEKPVRIVG 

LSTALAWARDLADWLNIKQMGLFNFRPSVRPVPLEVHIQGFPGQ 

HYCPRMASMNKPAFQAIRSHSPAKPVLI FVSSRRQTRLTALELI 

AFLATEEDPKQWLI^EREMENIIATVRDSNLKLTLAFGIGMHH 

AGLHERDRKTVEBLFVNCKVQVLIATSTIAWGVNFPAHLVIIKG 

TEYYDGKTRRYVDFPITDVLQMMGRAGRPQFDDQGKAVILVFDI 

KKDFYKKFLYEPFPVESSLLGVLSDHLNAEIAGGTITSKQDALD 

YITWTYFFRRLIMNPSYYNLGDVSHDSVNKFLSHLIEKSLIELE 

LS YCI E I GEDNRS I EPLTYGR IAS YY YLKHQTVKMFKDRIiKPEC 

STEELLSIIiSDAEEYTDLPVRHNEDHMNSELAKCLPIESNPHSF 

DS PHTKAHLI»LQAHLSRAMLP CPDYDTDTKT VLDQALRVCQAML 

DVAANQGWLVTVLNITNLIQMVIQGRWLKDSSLLTLPNIEKHHL 

HLFKKWKPIMKGPHARGRTSIECLPELIHACX3GKDHVFSSMVES 

ELHAAKTKQAWNFLSHLPEINVGISVKGSWDDLVEGHNBLSVST 

LTADKRDDNKWIKLHADQBYVLQVSLQRVHFGFHKGKPESCAVT 

PRFP KS KDEGWFLI LGEVDKRELIALKRVG YI RNHHVAS LSF YT 

PEIPGRYI YTLYFMSDCYLGLDQQYD/NLSQRYTSES PCTGQHQ 

GIj 


5943 


1 


2274 



Ui^TRHKTYLSSSWAKMAAAEGPVGDGELWQTWLP^^ 
BGIiXNQS PTEAEKPASSSLPSSPPPQIiLTRNWFGLGGELFLWD 
GEDSSFLWRLRGPSGGG\EEPALSQYQRLLCINPPLFEIYQVL 
LS PTQHHVALI GI KGLMVLELP KRWGKNSE FEGGKS TVNCSTTP 
YAER FFTS STSLTLKHAAWYPSE I LDPHWLLTSDNVI RI YSLR 
EPQTPTNV 1 1 LS EAEEESLVLNKG RAYTAS LGETAVAFDFGPLA 
AVPKTLFGQNGXDEWAYPLYILYENGETFLTYISLLHSPGN/I 
WKAVGS IAHAS \AAEDNYG YDACAVLCLPCVPN I LVI ATESGML 
YHCVVLEGEEEDDHTSEKSWDSRIDLIPSLYVFECVELELALKL 
RSGEDDPFDSDFSCPVKLHRDPKCPSRYHCTHEAGVHSVGLTWI 
HKI^HKFLGSDEEDKDSLQELSTBQKCFVEHILCTKPLPCRQPAP 
IRGFWIVPDILGPTMICITSTYBCLIWPLLSTVHPASPPLLCTR 
3DVEVAES PLRVLAETPDS FEKHIRS I LQRS VANP AFL KAS EKD 
CAPPPEECLQLLSRATQVFREQY ILKQDLAKEE I Q RRVKLLCDQ 
qOCQLEPLSYCREERKSLREMAERIADKYEBAKEKQEDIMNRMK 
kllhsfhselpvlsdserdmkkelqlipdqlrhlgnaikqvtM^ 

KD YQQQKMEKVLS LPKPT I ILS AYQRKCIQS I LKEEGEH I RBMV 
KQINDIRNHVNF 


5944 


157


3428


FS I ATFTDEPE VLTEPPS ATTTTTIG I SATWTTLAGSHGKRNNT 
ITTTSSKRKNRKNKITPENVQIIPDDPLPISYSQPEKVNGBSKS 
SSTSESGDSDNMRISSCSDESSNSNSSRKSDNHSPAVVTTTVSS 
KKQPS VLVTPPKE ERKS VSGKASI KLS ETISEGTSKSLSTCTKS 
GPSPLS S PWGKLT VAS PKRGQKRE EGWKEWRRS KKVSVPSTVI 
SRVIGRGGCNINAIR6FTGAHIDIDKQXDKTGDRIITIRGGTB3 
TRQATQL INAL I KDPDXE I DELI P KNRLKS SSANS KIGSS APTT 
TAANTSLMGI KMTTVALSSTS QTATALTVPAIS SASTHKTIKNP 
VN\NVRPGFPVSFP\LAYPPPQFAHALIiAAQTFQQIRPPRLPMT 
HFGGTF PPAQSTWGPPFVRPLSPARATNS P KPHMVPRHSNQNSS 
GSQVNSAGS LTSS PTTTTS S SASTVPGTSTNGS PSS PSVRRQLF 
VTWKTSNATTTTVTTTASNNNTAPTNATYPMPTAKEHYPVSS ? 
SSPSPPAQPGGVSRNSPLDCGSASPNKVASSSBQEAGSPPWET 
TNTRPPNS SSSSGS S SAHSNQQQPPGS VSQEPRPPLQQSQVPP P 
EVRMTVPPLATSSAPVAVPSTAPVTYPMPQTPMGCPQPTPKMET 
PAIRPPPHGTTAPHKNSASVQNSSVAVLSVNHIKRPHSVPSSVQ 
LPSTLSTQSACQNSVHPANKPIAPNFSAPLPFGPFSTLFENSPT 
SAHAFWGGSWSSQSTPESMLSGKSSYLPNSDPLHQSDTSKAPG 
FRPPLQRPAPSPSG I VNMD5? PYGSVTPS STHLGNFASNISGGQM 
YGPGA PLGGAPAAANFNRQHFS PLSLLTPCSSASNDS SAQS VSS 
GVRAPSPAPSSVPLGSEKPSWVSQDRKVPVPIGTERSARIRQTG 
TSAPSVIGSNLSTSVGHSGIWSFEGIGGNQDKVDWCNPGMGNPM 
IHRPMSDPGVFSQHQAMERDSTGIVTPSGTFHQHVPAGYMDFPK 
VGGMPFSVYGNAMIPPVAPIPDGAGGPIFNGPHAADPSWNSL1K 
MVS SSTENNGPQTVWTG PWAPHMNS VHMNQLG 


5945


1461 


197 


GVTHLFL FGKRKLRNG I AEDLKGQADFF FLI*VSEA WATGSPRA 
WLTCL1LPLPGIIFSVLPKAMSRPLLITFTPATDPSDLWKDGQQ 
QPQPEKPESTLDGAAARAFYEALIGDESSAPDSQRSQTEPARER 
KRKKRR IMKAPAAEAVAEGASGRHGQGRSIiEAEDKMTHRI LRAA 
QEGDLPELRRLLEPHEAGGAGGITINARDAFWWTPLMCAARAGQO 
AAVS YLLGRGAAWVGVCELSGRDAAQLAEEAGF PE VARMVRESH 
GETRSPENRS PTPSLQYCENCDTHFQDSNHRTSTAKLLS LSQGP 
QPPNLPIXJVPISSPGFKLLLRGGWEPGMGLGPRGEGRANPIPTV 
LKRDQEGLGYRS APQ PRVTHFPAWDTRAVAGRE \TPPRVATLS W 
REERRREE \ KDRAWERDLRTYMNLEF 


5946 


541 


1666 


IliGSYSSIQPEEYS \SWC\EWLQDI*IiA\YVSPK\HSYLRDLP 
SEGS PQRVNS IDFV\ EL\ EHLQPDVliVHAVLR WDF / TI LTEAV 
YS YRGQKQKKVMLTVEQAQDQHYALVLWGPGAAW \ YPQLQRKKG 
YIWEFKYLFVQCNYTLENLELHTTPWSSCECLFDDDIRAITFKA 
KFQKSAPS FVKISDLATHLEDKCSGWL IKAQISELAFP ITASQ 
KIALNAHSSLKS I FSSLPNI VYTGCAKCGLEI.ETDENRI YKQCF 
SCLPFTMKKIYYRPALMTAIDGRHDVCIRVESKLIEKILLNISA 
DCL^VIVPSSEITYGMVVADLFHSLrAVSAEPCVLKIQSI.FVL 
DBNSYPLQQDFSLLDFYPDIVKHGANARL 




3 


1317 


RG I PDRRRRGP I GRVNMDLENKVKKMGLGHEQGFGAPCLKCKElt 
CEGFELHFWRKI CRNC\NVAKKSM/ TVLLSNEEDRKVGKLF3DT 
KYTTLIAKLKSDGIPMYKRNVMILTNPVAAXKNVSINTVTYEWA 
P P VQNQALARQ YMQML P KEKQPVAGS EGAQYRKKQLAKQ L PAHD 

ODPSKCHELSPUirVTOSMRnPUVlfWCfaT f'\7r r T\\rvT n<iDunit^ 
Wwr « n,MUMMr Mb V ABNOy; v IVIV. I KobnijUjVXiU V JVus'CBMDAQG 

PKQMN I PGGDRS TPAAVGAMEDKS AEHKRTQ YS C YCCKLSMKEG 
D PAI YAERAG YD KLWHPACF VCSTCHELLVDMI Y FWKNBKLYCG 
RHYCDSEKPRCAGCDELIFSNEYTQAENQNWHLKHFCCFDCDSI 
U\GEIYVMVNDKPVCKPCYVKNHAVVCQGCHNAIDPEVQRVTYN 
NFSWHASTECFliCSCCSKCLIGQKFMPVEGMVFCSVECKKRMS 


5948


39 


3370 


yrerypvsggsvlrsalevcWdflsgltegsllpegffsgpidq 

GNHYQMRRKGliCHRGSAARHPSSPCSVKHSPTRETLTYAQAQRM 
VEIEIEGRLHRIS I FDPLE I ILEDDLTAQEMSE CMSNKENSERP 
PVCLRTKRHKNNRVKKK^EALPSAHGTPASASALPEPKVRIVEY 
SP^SAPRRPPVYYKFIEKSAiiKLDNSVKyDMDEBDYAWLBrVNB 
KRKGDCVPAVS QSMFE FLMDRFEKE SHCENQKQGEQQS LIDEDA 
VCCICMDGECQNSNVILFCDMCNLAVHQECYGVPYIPEGQWIiC/ 
RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\ VCALW\ I P 
BXVGFANTVFIEPIDGVRNIPPARWKLTXCNLCKEKGR/VGACl 

ochkancytafhvt:caqkaglymkmepvkeltgggttpsvrkta 

YCDVHTPPGCTRRPLNI YGDVEMKNGVCRKESS VKTVRSTS KVR 
KKAKKAKKALAEPCAVLPTVCAPYIPPQRLNR1ANQVAIQRKKQ 

fverahsywllkrlsrngapllrrlqsslqsqrssqqrendeem 

KAAKEKLKYWQRLRHDIiERARLL IELLRKREKLKREQVKVEQVA 
MELRLTPLTVLLRSVLDQLQDKDPARIFAQPVSIiKEVPDYLDHI 
KHPMDFATMRKRLEAQGYKNtiHEFEEDFDI»I IDNCMKYNARDTV 
FYRAAVRLRDQGGWLRQARREVDSIGLEEASGMHLPERPAAAP 
RRPFSWBDVDRLLDPANRAHLGLEEQLRELLDMLDLTCAMKSSG 
SRS KRAKLLKKE IALLRNKLSQQHSQPLPTG PGLEGFEEDGAAL 
G PEAGB E VLPRLETLLQPRKRSRS TCGDSE VEEE S PGKRLDAGL 
TNGFGGARSEQEPGGGLGRKATPRRRCASESSrsSSNSPLCDSS 
FNA PKCGRGKPAL VRRHTLEDRS EL I S CI BNGNYAKAAR IAAEV 
GQSSM WISTDAAASVLEPLKVVWAKCSGYPSYPALI I DPKMPRV 
PGHHNGVTIPAPPLDVLKIGEHMOTKSDEKLFLVLFFDNKRSWQ 
WLPKS KMVPLGI DETIDKLKMMEGRNSS IRKAVR IAFDRAMNHL 
SRVHGEPTSDLS DID 


5949 
5950


39 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ 

GiraYQMRRKGRCHRGSAARHPSSPCSVKHSPTRETIirVAQAQ^M 

VEIBIEGRLHRISXFDPLEIILEDDLTAQEMSECNSNKEWSERP 

PVCLRTKRHKNNRVKKKNEALPSAHGTPASAS ALPEPKVR I V3 Y 

SPPSAPRRPPVYYKFIEKSAEELDNEVBYDMDEEDYAWLEIVNE 

KRKGDCVPAVSQSMFEFLMDRFBKESHCENQKQGEQQSLIDEDA 

VTCICNDGECQNSWVILFCDMCNLAVHQECYGVPYIPEGQWLC/ 

RAHCXQSRARPADCVLCPNKGGAFKKTDDDRWGHV\VCALW\IP 

E\VGPANTVFIEPIDGVRWIPPARWKLT\CNLCKBKGR/VGACI 

QCHKANCYTAFHVTCAQKAGLYMKMBPVKELTGGGTTFSVRKTA 

YCDVHTPPGCTRRPLNIYGDVEMKNGVCRKESSVKTVRSTSKVR 

KKAKKAKKALAEPCAVLPTVCAP Y I PPQRLNRI ANQVAI QRKKQ 

FVERAHSYWIiLKRLSRNGAPIJuRRLOSSLQSQRSSQQRENDEEM 

KAAKEKLKYNQRLRHDLBRARLLrELLRKREKLKREQVKVEQVA 

MELRLTPLTVLLRSYLDQLQDKDPARI FAQPVSLKE VPDYLDHI 

IOIPMDFATMRKRI,BAQGYKNLHEFEEDFDLIIDNCMK*«rNARDTV 

r ik>^vk1jKIJQC^vvlrqaRREVDSIGLEEASGMHLPERPAAAP 

RRPFSWBDVDRLLDPANRAHIX3LEEQLRELLDMLDLTCAMKSSG 

SRS KRAKLLKKB IALLRNKLSQQHSQ PLPTG PGLEG FEEDGAAL 

GP EAGER VLPRLETLLQPRKRS RSTCGDSEVBEES PGKRLDAGL 

TNGFGGARSEQEPGGGLGRKATPRRRCASESS ISSSNSPLCDS S 

FNAPKCGRGKPALVRRHTLEDRSELISCIENGNYAKAARIAAEV 

GQSSMWISTDAAASVIjEPT.frwwaK'r'cwocvDTiT TTMnvummii 
w Mk r Wt ni*.uiurtnfvjvu£irijRV vw/UVLoblrb XtrALiUDP/CMPRv 

PGHHNGVTIPAPPLDVLKIGEHMQTfCSDEKLFLVLFFDNKRSWQ 

WLPKSKPJVPLGIDETIDKLK1MMEGRNSSIRKAVRIAFDRAMNHL 
SRVHGEPTSDLSDID 


5951 


1166 


373 


ESRS-jTMSTSQPGACPCQGAASRPAILYALLSSSLKAVPR?RSR 
CLCRQHRPVQLCAPHRTCREAIJJVLAKTVAFLRNLPSFWQLPPQ 
DQRRLLQGCWGPLFLLGLAQDAVTFEVAEAPVPS ILKKILLEEP 
SSSGGSGQLPDRPQPSLAAVQWLQCCLESFWSLELSPKE\YACL 
KGPILFNPDVPGLQAASHIGHLQQEAHWVLCEVLEPWCPAAQGR 
LTRVLLTASTLKS IPTS LLGDLFFRP I IGDVD IAGLLGDMLLLR 




143


§449 



^NVKPSLLWQLFKFSDKEEHEQNDStSGKTGETGVEBMlATRK ~ 

VEQDSKETVKLSHEDDHILEDAGSSDISSDAACTNPNKTENSLV 

^PSCVDEVTECNLELKDTMGlADKTEimBRNKIEPLGYCEDA 

SSNRQLESTEFNKSNLEWDTSTFGPESNILENAI CDVPDQNSK 

2LNAI ESTKI ESHETANLQDDRNSQSSS VS YLESKSVKSXHTKP 

/IHSKQNMTTDAPKKIVAAKYEVIHSKTECVNVKSVKRNTDVPES 

2QNFHRPVKVRKK0IDKBPKIQSCNSGVKSVKNQAHSVLKKTLQ 
3226 



539



330



811 



DQTLVQlF'KP^THSLSDKSHAH PGCLKEPHHPAQ'rGHVSHSSQK 
QCHKPQQQAPAMKTNSHVKBBLEHPGVEHFKEEDKIiKLKKPEKN 
LQPRQRRSSKSFSLDEPPLPIPDNIATIRREGSDHSSSFESKYM 
WTPSKQ CGFCKKPKGNRFMVGCGRCDDW FHGDC VGLS LS QAQQM 
GEEDKE YVCVXCCAEEDKKTE I LDPDTIiENQAT VE FHSGDKTME 
CEKLGLSKHTTNDRTKYIDDTVKHKVKILKRESGEGRN3SDCRD 
NEIKKWQTAPLRKWGQPVLPRRSSEEKS EKIPKESTTVTCTGE K 
ASKPGTHEKQEMKKKKV\BKGVLNVHPAASASKPSADQ IRQSVR 
HSLKDILMKRLTDSNLKVPEEKAAKVATKIEKBLFS FFRDTDAK 
YKNKYRSLMFNLKDPKNNILFKKVLKGBVTPDHLIRMSPEELAS 
KELAAWRRRENRHTIEMIEKBQREVERRPITKITHKGEIEIBSD 
APMKEQEAAMEIQEPAANKSLEKPEGSBKXRKEEVDSMSKDTTS 
QHRQHLFDLNCKICIGRMAPPVDDIiSPKKVKWVGVARKHSDNE 
AES IADALSSTSNIiASEFFEEEKQES PKSTFSPAPRPEMPGTV 
EVBSTFLARLNFIWKGFINMPSVAKFVTKAYPVSGSPEYLTEDL 
PDS I QVGGR I S PQTVWDYVE KI KASGT KB I CWRFT P VTEBDQ I 
S YTLLFAYFSSRKR YGVAANNMKQVKDM YLI PLGATDKI PHPLV 
P FDG PGLE LHRPNLLLGLI I RQKLKRQHS ACASTS K IAETPES A 
P P IALPPDKK3 KI EVSTEEAP E EENDFFNS FTTVLH KQRNKPQQ 
NI»QEDIiPTAVB?LMEVTKQEPPXPLRFLPGVLIGWENQPTTLEL 
ANKPL P VDDI LQS LI>GTTGQVYDQ\ AQS VMEQNTVKB I P FLNEQ 
TNSKIK KTDNVEVTDGENKE IKVKVDNI SESTDKSAEI ETS WG 
SSS ISAGSLTSLS LRGKPPDVSTEAFLTNLS IQSKQEETVESKE 
KTLKRQLQEDQENNIiQDWQTSNSS PCRSNVGKGNIDGNVSCSEN 
LVANTARSPQFINLKRDPRQAAGRSQPVTTSESKDGDSCRNGEK 
HMLPGLSHNKEHLTEQINVEEKLCSAEKNSCVQQSDNLKVAQNS 
PS VBNI QTS QAEQAK PLQED ILMQN I ETVH PFRRGSAVATSHFE 
VGMTCPSEFPSKSITFTSRSTSPRTSTNFSPMRPQQPNLQHLKS 
SPPGFPFPGPPNFPPQSMFGFPPHLPPPLLPPPGFG\FA\QNPM 
VPWPPW\HLP\GQPQRMMGPLSQASRYIGPQNFYQVKDIRRPE 
RRHSBPWGRQDQQQLDRPFNRGKGDRQRFYSDSHHLKRERHEKE 
WEQESERHRRRDRSQDKDRDRKSREEGHKDKERARLSHGDRGTD 
GKASRDSRNVDKKPDKPKSEDYEKDK2REKSKHREGEKDRDRY1I 
KDRDHTDRTKSKR 



pparrsardlpralsmeaarpsgswngalcrll\lvti,\aflif 

AS DACKNVTLHVPS KLDAE KLVGRVNLKECFTAANLIHS S DPDF 
0 1 LEDGS VYTTNTI LLSS E KRSFTI LLSNTENQE K KKI F VFLEH 
QTKVLKKRHTKEKVLRRAKRRWAPIPCSMLENSLGPFPLFLQQV 
QS DTAQNYTI Y YS I RG PGVDQEPRNL FYVBRDTGNLYCTR P VDR 
BQYESFEI1AFATTPDGYTPELPLPLIIKIEDENDNYPIFTEET 
YTFTI FENCRVGTTVGQVCATDKDEPDTMHTRLKYS I IGQVPPS 
PTLFSMHPTTGVI TTTSSQLDRELIDK YQLKI KVQDMDGQYFGL 
QTTSTCI INIDDVKDHLPTFTRTSYVTSVEENTVDVBILRVTVE 
DKDLVKTANWRANYTI LKGNENGNFKI VTDAKTNEGVLCVVKPL 
NYEBKQQMILQIGWNEAPFS REASPRSAMSTATVTVNVEDQDE 
GPECNPP I QTVRM KEN AEVGTTSNG YKAYDPE TRS S S GIRY KKI* 
TDPTGWVTrDENTGSIKVFRSLDREAETIKNGIYNITVLASDQG 
GRTCTGTLGHLQDVNDNSPFIPKKTVriCKFTMSSAElVAVDP 
DEPIHGPPFDFSLESSTSEVQRMWRLKAINDTAARLSYQM3PPF 
GSYWPITVRDRI,GMSSVTSLDVTIrCDCITENDCTHRVDPRIGG 
GGVQLGKWAILAILLG IALFFCILFTLVCGASGTSKQPKVI PDD 
LAQQNLIVSNTEAPGDDKVYSANGFTTQTVGASAQGVCGTVGSG 
I KNGGQETI EMVKGGHQTS ESCRGAGHHHTLDS CRGGHTEVDNC 
RYTYSEWHS FTQPRLGEES IRGHTL IKN 



PLLCNPDPGWYWWVXQESEISKESQEMDARPKLDLGFKEGQTIK 
LCIGNITMKKGGASKPRTARGGGLSLL P PPPGGKVTI P PPSS / V 

KLPSTNHVTPPSIPKSNHGGSDADXLLDLDSPAPVTTPAPTPVS 
VSNDLWGDFSTA5 SS VPNQAPQPSNWVQF 



PPPPPPKI^MADtEA^^VsVl^EKSKAT PAARASKRIVT 
PEPS IRS VMQKYLAERNEITFDXIFNQKIGFLLFKDFCLNEINE 
AVPQVKF YEEIKE YEKLDNBBDRLCRSRQI YDAYIMKELLS CSH 
PySKQAVEHVQSHLSKKQVTSTLFOPyiESICBSLRGDIFQkPM 
BSDKFTRPCQWKNVB LN IHLTMNE FSVHR I IGRGGPGE VYGCRK 
ADTGKMYANKCimKRIKMKQGETLALNERIMLSLVSTGDCPFI 
VCMTyAFHTPDKLCFILDIiMNGGDLHVHLSQHGVFSBKEMRrYA 
TEI 1LGLEHMHNRFVVYRDLKPANILLDEHGHARIS \DLGLACD 
FS KKKPHAS VGTHGYP1APE VLQKGTAyDSS ADWFSIiG OILFKLL 
RGHSPFRQHKTKDKHEIDRMTLTVKVELPDTFSPELKSLLEGLL 
CRDVSKRLGCHGGGSQBVKEHSFFKGVDWQHVYLQKYPPPr,IPP 
RGB VNAADA FDI GSFD BEDTKG I KLLDCDQELYKNFPLVT S ERW 
QQEVTETVVEAVNADTDKIBARKRAKNKQLGHEEDYALGKDCIM 
HGYMLKLGN PFLTQWQRR YFYLFPNRLEWRG EGE SRQNLLTMEQ 
ILS VEE TQ X KDKKCI LFRI KG GKQFVLQCBS DPE FVQWKKE LNE 
TFKEAQRLLRRAPKFLNKPRSGTVELPKPSLCHRNSNGL 




1726 


444 


KREREFRLAVCPLRYPSAYESSPGTELRECGLCRSGQEFADCRR 
PANRQDVLSGWINLPVLQLTKDPLKTPGRLDHGTRTAFIHHREQ 
VWKRCINI WRDVGLFGVLNE I ANS KEEVFE WVKTASGWALALCR 
WAS SLHGS LFPHLS LRSE DLIAEFAQVTNWSSCCLRVFAWHP HT 
NKFAVALLDDSVRVYNAS STI VPS LKHRLQRNVASLAWKPLS AS 
VLAVACQSC1LIWTLDPTSLSTRPSSGCAQVLSHPGHTPVTSLA 
WAPSGGRUSASPVDAAJRVI^VSTETCVFLPWFRGGGVTNIJjW 
SPDGSKILATTPSAVFRVWEAQMWTCERWPTLSGRCQTGCWS PD 
GSRLLFTVLGEPL I YSLS FPERCGEGKG\ ALB VQSQQRLWQI CL 
RQQYRHQMVRRGLGERLTPWSGTPVGNVWLCL 


5956 


1705 


139 


GVGVRGARAMATVQEKAAALNLSALHS PAHRP PGP'S VAQKP PGA 
TYVWS S I INTLQTQVEVKKRRHRLKRHNDCFVGSEAVD VI FSHL 
1 QNKYFGDVDI PRAKWRVCQALMDY KVFEAVPTKVFGKDKKPT 
FEDS SCS L YRFTT I PNQDSQLG KENKLY S PAR YADALFKS SD I R 
SASLEDLWENLSLKPANSPHVNISATLSPQVINEVWQEBTIGRL 
LQLVDLPLLDSLLKQQEAVPKI PQPKRQSTMVNSSNYLDRG ILK 
A YSDS QBDEWLSAAIDCS E YLPDQM WEI SRS FPEQPDRTDLVK 
ELLFDAIGRYYSSREPLLNHLSDVHNGIAELLVNGKTEIALEAT 
QLLLKLLDFQNREE FRRLL Y FMAVAANPS EFKLQKES DNRMWK 
RIFSKAIVDNKNI^KGKTDLLVLFL\MDHQKDVFKIPGTL\HKI 
VS \ VK \ LMAI QNGRDPNRDAGYI YCQRIDQRDYSNNTEKTTKDE 
LLNLLKTLDEDSKLSAKEKKK\LLGQFYKCHPDIFIEHFGD 


5957 


1479 


451 


ELQVAVAMDTLDRWKPKTKRAKRFLEKREPKLNENI KNAMLI K 
GGNANATVTKVLKDVYALKKPYGVLYKKKNITRPFEDQTSI,BFF 
SKKSDCSLFMFGSHNKKRPNNLVIGRMYDYHVLDMIELGIRNFV 
SLKDIKNS KCPEGTKPML I FAGDDFDVTEDYRRLKS LLIDFFRG 
PTVSN IR LAGIiE YVLH FTALNG K X YFRS YKLLLKKS G CRT PR IE 
LEBNX3PSLDLVLRRTHLASDDLYKLSMKMPKALKPKKXKNISHD 
TFGTTYGR IHMQKQDLS KLQTRKM\ KGLKKRP AER I T3DHE KKS 
KRI KKKLMELSQ PLLFHCVLLKRI IKHQS I QSFL 


5958


1 


3138 



AAAI£?MI/LWFPACQAFNLDVEXLTVYSGPKdSYFGYAVDFHIPD 
ARTASVLVGAPKANTSQPDIVEGGAVYYCPWPAEGSAQCRQIPF 
DTTNNRKI RVNGTKE P IEFKSNQWFG\ ATVKA\HKGKSCGPVAP 
LLFTWRNFLKPTPEKGPVGTCYVAIQNFSAVAEFSPCGNSNADP 
EGQGYCQAGFSLDFYKNGDLIVGG PGSF YWQGQVITASVADI IA 
NYSFKDILRKLAGBKQTBVAPAS YDDS YLGYSVAAGEFTGDSQQ 
ELVAG I PRGAQNFGYVS I INS YDMTFIQNFTGBQMAS YFGYTW 
VSDVNSDGLDDVLVGAPLFMEREFESNPREVGQIYLYLQVSSLL 
FRDPQ ILTGTETFGRFGSAMAHLGDLNQDGYNDIAIGVPFAGKD 
QRGKVL I YNGNKDGLNTKPFP KFCQGVWASHA7PSGFGFTLRGD 
SDIDKNDYPDLI VGAFG TGKVAVYRAR P WTVDAQLLLHPM I IN 
LENKTCQVPDSMTSAACFSLRVCAS VTGQS IANTIVLHAE VQLD 
S LKQ KGAI KRTLFLDNHQ AHRVFPLVI KRQKSHQCQDFI VYLRD 
ETEFRDKLSPINISLN YSLDESTFKEGLEVKP ILNYYRENIVSE 
Q AHI LVD CGEDNLCVPDUCLSARPDKHOVI IG3ENHLMLI INAR 
tIEGEGAYEAELFVMI PEEADYVG I ERNNKG FRP LSCEYKMENVT 
RMWCDLGNPMVSGTNYS LGLRFAVP RLEKTNMS INFDLQ IRS S 
H KDNPDSNFVS LQINITAVAQVE I RGVSHP PQI VLP I HNWEPEB 
EPHKEEEVGPliVEHIYELElNIGPSTISDTILKVGWPFSARDBFlj 
LYI FHIQTLGPLQCQPNPNTIMPQD IKPAASP3DTPELSAFLRNS 
TI PHLVRKRDVHWEFH RQSPAKI LNCTN I ECLQI S CAVGRLE G 
GESAVLKVRSRLWAHTPLQRKNDPYALASLVSFEVKKMPYTDQP 
AKLPEGS I A I KTS VI WATPNVS FS IPLWVI ILAILLGLLVLAlL 
TLALWKCG FFDRAR PPQ EDMTDREQLTNDKTPEA 


5959


1 


1166 


GTSGYAAQQLPSIiLKERJEFHLGTLNtCVFASQWLNHRQVVCGTKC 
NTLFWD VQTS Q I TKI P ILKDREPGGVTQQGCGIHAIELNPSRT 
IiLATGGDNPNS LAI YRLPTLDPVCVGDDGHKDWIFS IAW ISDTM 
AVSGSRDGSMG L WE VTDDVLTKSDARHNVS RVP VYAHI THKALK 
DI PKEDTNPDNCKVRAIiAFNNKNKELGAVS LDGYFHLWKAENTL 
SKLLSTKLPYCRENVCIAYGSEWSVYAVGSQAHVSFLDPRQPSY 
NVKS VCSRERGSGIRS VSFYEHI ITVGTGQGSLLFYDIRAQRFL 
EERL5ACYGSKPRIAGENLKLTTG\KGWLNHI)ETWRNYFSDIDF 
FPNAVYTHCYDSSGTKlr FVAGGPLPS GLHGNYAGLWS 




2853 


870 


FVWSDGGPRPKRGPAVGAGAAKLSDPWAMTPGTANRATNPLNKE " 
LDWAS INGFCEQIiNEDFEGPPLATRLLAHKIQS PQEWEAIQALT 
VLETCMKSCGKRFHDEVGKFRFLNELIKWSPKYLGSRTSEKVK 
NKILELLYSWTVGLPEEVKIAEAYQMLKKQG\IVKSDPKLPDDT 
TFPLPPPRPKNVI FEDEE KSKMLARLLKS SHPB DLRAANKLI KB 
KVQEDQKRME K I S KRVNAI EEVNNNVKLLTEMVMSHS QGGAAAG 
SS EDL\MKB L \ YQRCJ2RMR PTLFPTGR VDTEDND\EALAE I LQA 
NDNLTQVINL YKQ LVRGEE VNGDATAGS I PGSTSALLDLSGLDL 
PPAGTTYPAMPTRPGEQASPEQPSASVSLLDDELMSLGLSDPTP 
PSGPSLDGTGWNS FQSS DATE P PAPALAQAPSMESRP PAQTSLP 
ASSGLDDLDLLGKTLLQQSLPPESQQVRWEKQQPTPRLTLRDLQ 
NKSSSCSSPSSSATSLLHTVSPEPPRPPQQPVPTELSLASITVP 
LBS IKPSNILPVTVYDQHGFRILFHFARDPLPGRSDVLWWSM 
LSTAPQ PIRN1 VFQSAVP KVMKVKLQP PSGTELPAFKPI VHPSA 
ITQVLLLANPQKEKVRLRYKLTFTM9DQTYNEMGDVDQFPPPET 
WGSL 


5961 


198 


3147 


SGE PRPEPGNMATCIGEK I EDFKVGNLLGKGSFAG VYRAES IHT"~ 
GLEVAI KMIDKKAMYKAGMVQRVQNE VKIHCQLKHPSI LE L YN Y 
FEDSNYVYLVLEMCHNGEMNRYLKNRVKPFSENEARHFMHQIIT 
GMLYLHSHGILHRDLTLSNLLLTRNMWIKIADFGLATQLKMPHE 
KH YTLCGTPNYI S P BIATRS AHGLESDVWS LGCMFYTLL IGR PP 
FDTDTVKNTLNKWLAD YBMPTFDS IEAKDLIHQLLRRNPADRL 
3LSSVLDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAITASS 
STSISGSLFDKRRLLIGQPLPNKMTVFPKNKSSTDFSSSGDGNS 
FYTQWGNQBTSNS GRGRV I QDAEERPHS RYLR RAYSS DRSGTSN 
SQSQAKTYTMERCHSAEMLSVSKRSGGGENEERYSPTDNNANIF 
NFFKEKTS SSSGS FERPDNNQALSNHLCPGKTPFPFADPTPQTE 
WQQWFGNI^INAHLRKTTEYDSISPNRDFQGHPDLQKDTSKNA 
WTDTKVKKNSDAS DNAHS VKQQNTMKYMTALHS KPE I IQQE C VF 
GSD PLSBQ SKTRGM3PPWG YQNRTLRS I TSPLVAHRL KP IRQKT 
KKA WS I LDSE EVCVELVKE YAS QE YVKE VLQ IS SDGNTI T I YY 
PNGG\RGFPIA\DRPPSPT\DNISR\YSF\DKLPEKYWRKYQYA 
SRFVQLVRSKSPKITYFTRYAKCILMEWSPGADFEVWFYDGVKI 
HKTED F IQVTE KTGKS YTLKS E3EVNS LKEE IKM YMDHANRGHR 
ICLALBS I ISEEERKTRSAPFFPI I IGRKPGSTS5PKALS PPPS 
VDSNY PTRDRASFNRMVMHS AAS PTQAP I LNPSM VTN3GLGI/IT 
TASGTDISSNSLKDCLPKSA0LLKSVFVKNVGWATO\r»T^rtavw 
VQFNDGSQLWQAGV S S I S YTSPNGCA TTR\ YGBNEKLPDYI KO 
KLQCLSS I LLMFSNP TPN FH 


5962 


20 


2447 


RVCSSSASTA£QAV>^AWEEIRRlAADFQRAQFAEArQRiSER'~ 
NCIEIVNKLIAQKQLEVVHTLDGIOBYITPAQISKEMRDELHVRG 
GRVNIVDLQQVINVDLIHIENRIGDIIKSEKHVQLVLGQI.IDEN 
YLDRLAEEVNDKLQES GQ VT ISELCKT YDLPGNFIiTQALTQRLG 
RI ISGH I DLDNRG V I FTEAFVARHKAR I RGLFS A I TRPTAVNSI* 
IS KYGFQBQIiLYS VLEELVNSGRLRGT WGGRQDKAVFVPD I YS 
RTQSTWVDSFFRQNGYIiEFDALSRLGI PDAVSYIKKRYKTTQLL 
PLKAACVGQGLVDQVEASVEEAISSGTWVDIAPLLPTSLSVBDA "1 
AILLQQVMRAFSKC2ASTWFSDTVWSEKF\lNDCTELFRELMH 
QKAEKEMKNNPVHLITEEDLKQISTLESVSTSKKDKKDERRRXA 
TEGSGSiMRGGGGGNARETFCI KKVKKKGRKDDDSDDESQS S HTGK 
KKPE I S FMPQ DE I ED FLRKHIQDAPEEFI S EIAE YLI KPLNKTY 
LBWRSVFMSSTTSASGTGRKRTIKDLQEEVSNLYNNIRLFEKG 
MKF FADDTQAALTKHLLKS VCTD ITNLI FNFLAS DLMMAVDDPA 
AI TSE I RKK I LS KLSEETKVALTKLHNSLNE KS I EDFI S CLDSA 
AEAa>I^fVKRGDKKRBRQ^LFQHRQAIAEQLKVTEDPALILHLT 
SVLLFQFSTHSMLHAPGRCVPQI IAFLNSKIPEDQHALLVKYQG 
LWKQL VSQS KKTGQG DYP LNNE LDKEQEDVAS TTRKE LQELSS 
SIKDLVLKSRKSSVTEE 


5963 


62 


1130 


PWNPQDFPGNRGLMG\QKGEIGPP\GQQGKKGA?GMP\GLMGSN 
GS PGQPGTPGS KGSKGEPG IQGM PGASGLKGEPGATGS PGEPG Y 
MGLPG I QGKKGD JCGNQGEKGI QG Q KG ENGRQ G I PGQQG IQGHHG 
AKGE RGEKGE P G VRG AI GS KGES GVDGLMGPAG PKGQPGDPG PQ 
GPPGLDGKPGREFSEQFIRQVCTDVIRAQLPVLLQSGRIRNCDH 
CLSQHGSPGI PGPPGPIGPEGPRGLPGLPGRDGVPGLVGVPG.^P 
GVRGLKGLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGISKEG 
PPGD P GLPGKDGDHGKPGI QGQPGPPG I CD P SLCFS VIARRD P F 
RKGPNY


5964 


3 


2147 


SCRTRGRl^PI^PR3AGS5RGSRAR^EPPRPGGME2AC^VQTflH 
RGDP HELRNI FLQ YASTE VDGERYMT PEDFVQRYLG LYNDPNSN 

PKIVQLIiAGVADQTKDGLISYQEFIiAFESVLCAPDSMFIVAFQL 
FDKSGNGEVTFENVKEIFGQTIIHHHIPFNWDCEFIRLHFGHNR 
KKHLNYTEFTQFLQELQLEHARQAFALKDKSKSGMISGLDFSDI 
MVT IRSHMLTP FVEENLVSAAGGS I SHQV SFSYFNAFNS LLNNM 
ELVRKI YSTLAGTRKDAEVTKEEFAQSAI RYGQATPLEIDILYQ 
LADLYNASGRLTLAD I E R I APLAEGALP YNIAE LQRQQS PGLGR 
PI WLQ I AESA YRFTLGSVAGAVGAXAVYP IDLVKTRMQNQRGSG 
SWGELMYKNS PD CFKKVLR YEGFFGL YRGLI PQ LIGVAPEKAI 
KI*TVKDFVRDKFTRRDGSVPIjPAEVIiAGGCAGGSQVI FTNPLE I 
VKERLQ VAGE I TTGPRVSALNVLRDLGI FGLYKGAKACFLRD I P 
FSAI YF PVYAHCKLLLADENGHVGGLNLLAAGAMAG\ VPAAS LV 
TPADVI KTRLQVAARAGQTTYSGVIDCFRKI L\REEGPSAFWKG 
TAARVFRSS PQFG \ VTL VTYELLQRG FYIDFT3GL KPAGSEPTPK 
S RI ADL PPANPDH IGG YRLATATFAG I ENKFGLYLP KFKS PS VA 
WQPKAAVAATQ


5965 


1 


1498 


WT^YRFLPTSNMAAlOiRSLLPPDLRLQFWlJiARtQkCFLSRG j 
CGSYCAGAKASPLPGKMAMGLMCGRRBLLRLLQSGRRVHSVAGP 
SQWLGKPLTTRLLFPAAPCCCRPHYXFLAASGPRSLSTSAISFA 
EVQVQAP P WAATPS PTAVP E VASGETAD WQTAAEQS FAELGL 
GS YTPVGL I QNLLB FMHVDLGLP WWGAI AACTVFARCLI FPLIV 
TGQREAAR IHNHLPE IQKFS SR I REAKLAGDHIEYYKAS SEMAL 
YQXKHGI KLYKPLILPVTQAPI FI SFFIALREMANLPVPSIjQTG 
GLWWFQDLTVSDP I YI LPLAVTATMWAVLELGAETG VQS SDLQ W 
MRNVI RMM PL I XL P ITMHFPTAVFMYWDSSNLFSLVQVS CLR 1 p 
AVRTVLKI PQRWHDLDXLPPREGFLES FKKGWKNAEMTRQLRE 
REQ^RNQLELAARGPLRQTFTHNPLLQPGKDNPPNI PSS\SSS 
SSKPKSKYPWHDTLG




5966


102 


1925 


rskqvmarltkrrqadtkaiqhlwaaieiirkqkqianidritkH 

YMSRVHGMHPKETTRQLSIiAVKDGH VETLTVGCKG S KAGI EQB 

gywlpgdeidwetenhdwycfechlipgevi,icdlcfrvyhskcl 1 
s de frlrds s s pwqcp vcrs i kkkntnkqemgtylrfi vs rm ke 
raidlnkkgkdnkhpmyrrlvhsavdvptiqekvnegkyrsyee 
fkadaqlllhntvifygadseqadiarmlykdtchel\delqlc 

KNCF YLANAR PDNW FCYPCI PNHELDWAKMKQFGFWPAKVKQKE 
DNQVDVRFFGHHHQRAWI PSENIQD I TVNIHRLHVKRSMG WKKA 
C3)BLELHQRFLREGRFWKSKNEDRGEEEABSSISSTSNEQLKVT 
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV 
SVSTQTFGCLSASS PRMLHRSTQTTNDGVCQSMCHDKYTKI FNDF | 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
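As an illustration only, the notation in the column heading above can be read mechanically. The following is a minimal sketch, assuming each table entry is handled as a plain string; the helper name `summarize_segment` is hypothetical and is not part of the specification. It tallies residues and the three special symbols (`*`, `/`, `\`), and collects any other character, such as digits or lowercase letters left by scanning, as unrecognized rather than guessing at it.

```python
# One-letter amino acid codes exactly as given in the table legend.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
}

# Special symbols from the same legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment):
    """Hypothetical helper: tally residues and legend markers in one entry.

    Whitespace is ignored. Any character that is neither a legend
    residue code nor a legend marker is reported as unrecognized.
    """
    residues = 0
    markers = {name: 0 for name in MARKERS.values()}
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue
        if ch in RESIDUES:
            residues += 1
        elif ch in MARKERS:
            markers[MARKERS[ch]] += 1
        else:
            unrecognized.append(ch)
    return {
        "residues": residues,
        "markers": markers,
        "unrecognized": unrecognized,
    }


if __name__ == "__main__":
    # Invented example string, not taken from the table.
    print(summarize_segment("MKT\\AV/L*"))
```

The sketch deliberately keeps unrecognized characters separate instead of mapping them to the nearest residue code, since the legend gives no basis for such a correction.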


5967 






^RMKSDHKkifiravVRSALE^^ 

EMDRKCKQVKE KCKEEPVEE I KKLATQHKQL I SQTKKKQ WCYN C 
EEEAMYHCCWNTSYCS I KCQQEHWHAEHKRTCRRKR 




102 


1925 


KbKQVMARIiTKRRQADTKAIQHLWAAlEI^RNQKQIAWIDRITK 
YMSRVHGMH PKETTRQLS LAVKDGIiI VETLTVGCKGS KAG I EQB 
GYWLPGDEIDWBTENHDMYCFBCHLPGEVLICDLCPRVYHSKCL 
SDEFRLRDSSSPWQCPVCRSIKKKNTNKQEMGTYLRFIVSRMKE 

v^*****VNXMtr m i KKuVHSAVuVPTIQEKVNEGKYRS YEB I 
FKADAQLLLHNTVIFYGADSEOADIARMI.YKDTCHEI*\DELQLC 
KNCFYLANARPDMWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE 
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA 
CDELELHQRFLREGR FWKS KNEDRGEEEAESS IS STSNEQLKVT 
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQBIPTMPQPIEKV 
SVSTQTKKLSASSPRMLHRSTQTTNDGVCQSMCHDKYTKIFNDF 
KDRMKSDHKRETBRWREALEKLRSEMBEEKRQAVNKAVANMQG 
E^RKCKQVKEKCXEEFVEEIKKLATQHKQLISQTKKKQWCYNC 
EEEAMYHCCWNTS YCS I KCQQEHWHABHKRTCRRKR 




81 


1288 


v Kr rKK^laAPFTVIjrPGRQQGVFLGPQRPGSEPDI PARGQPHPP 
RPVG VSTSAQAQVQPPAMHRRRLALG LG FCLLAGTSLS VIjWVYT* 
ENm>PVSYVPYYLPCPEIFNMKLHYKREKPLQPVVMSQYPQPKL 
LEHRPTQLLTLTPWUVPIVSEGTFMPELLQHIYQPLNLTIGVTV 
PAVGN/HFLBS AEEFFKRGYRVHY YI FTDNPAAVPGVPLGPHRL 

c AuuriiiiwLt. 1 iMRRMETISQHIAKRAKREVDYXFCLDVD I 
M VFRNPWGPETLGDL VAA1 HP S YYAVPRQQFpy E R RR VSTAFVA 
DSEGDFYYGGAVFGGQVARVYEFTRGCHMAILADKANGII4AAWR 

ESSHLNRHFISNKPSKVLSPEYLWDDRXPQPPSLKLIRFSTLDK 
DISCLRS 


5970 


1126 


533 


DVGFNIKRKkUJLbVFiiESPRKPSGRUiJRAPEKyRRIAANKCLC 
TG VREGE P PS /TTS QKVKEAGRD FTYL I WLFG I S 1 TGGLFYT I 
FK3LFSSSSPSKIYGRALEKCRSHPEVIGVFGESVKGYGEVTRR 
GRRQHVRFTEYVKDGLKHTCVKFYIEGSEPGKQGTVYAQVKEP3P 
GSGEYDFRY1 FVE IES YPRRTI I I EDNRSQDD 




316 

1 


4712 



S QDNIGHRXtLOKHGW KLGQGLG KSLQGRTDP IPX WJCYDVMGMG "" j 
RMEMKU3YAEDATERRRVLEVEKEDTEELRQKYKDYVDKEKAIA 
KALEDLRANF YCEL CD ICQ YQKHQE FDNHINS YDHAHKQRLKDLK 
QRE FARNVSS RSRKDE KKQE KALRRLHELAEQRKQAE CAPGSGP 
MFTCPTTVAVDEEGGEDDKDBSATNSGTGATASCGLGSE FSTDKG 
GPFTAVQITNTTGLAQAPGLASQGISFGI KNNLGTPLQKLGVSF 
SFAKXAPVKLE3IASVFKDHAEEGTSEDGTKPDEKSSDQGLQKV 
GDSDGSSNLDGKKEDEDPQDGGSLASTLS KLKRMKREEGAGATE 
PEYYHYIPPAHCKVKPNFPFLLFMRASEQMDGDNTTHPKNAPES 
KKGSSPKPKSCIKAAASQGAEKTVSEVSEQPKETSMTEPSEPGS 
KAEAKKALGGDVSDQSLESHSQKVSBTQMCESNSSKETSLATPA 
GKESQEGPKHPTGPFFP VLS KDESTALQWPS ELLI FTKAE D S IS 
YSCNPLYFDFKLSRNKDARTKGTEKPKDIGSSSKDHLQGLDPGE 
PNKSKEVGGBKIVRSSGGRMDAPASGSACSGLMKQEPGGSHGSB 
TEDTGRSL PS KKERSGKSHRHKXKK KHKKSS KHKRKHKADTEEK 
SSKAESGEKSKKRKKRKRKKNKSSAPADSERGPKPEPPGSGSPA 
PPRRRRRAQDDSQRRSLPAEEGSSGKKDEGGGGSSSQDHGGRKH 
KGELPPSSCQRRAGTKRSSRSSHRSQPSSGDEDSDDASSHRLHQ 
KSPSQYSBEEEEEDSGSEHSRSRSRSGRRHSSHRSSRRSYSSSS 
DASSDQSCYSRQRfiYSDDSYSDYSDRSRRHSKRSHDSDDSDYAS 
3 KHRSKRHKYSSSDDDYSI.SCSQSRSRSRSHTRERSRSRGRSRS 
3SCSRSRSKRRSRSTTAHSWQRSRSYSRDRSRSTRSPSQRSGSR 
CRSWGHESPEERHSGRRDFI RSKIYRSQSPHYFRSGRGEGPGKK 
5DGRGDDS KATG PPS QNSNIGTGRG SEGDCS PBDKNS VTAKLLL 
IKIQSRKVERKPSVSEEVQATPNKAGPKLKDPPQGYFGPKLPPS 
*GNKPVLPL IGKLPATRKPNKKCEESGL3RGEBQEQS ETEEG P D 
JSSDALFGHQFP\S EETTGPLLDPP PEES KSGBVTADHPVAPLG 
1 PAHFDCYLGDPTI SHNYLPDPSDGNTLBSLDS S SOPGP VBS S L 
iPIAPDIiBHFPSYAPPSGDPS IESTDGABDA\SIiAPLBSQPI TF 


5971 






~ TrEEMEXYSK^QAAQQHIQQQI^AKUVXAKPASAAI^ATPAlT^ 
QPIHICK2PATASATSITTVQHAILQHHAAAAAAAIGIHPHPHPQ 
PLAQVHHIPQPHLTPISLSHLTHSI I PGHPATFLASHPIKIIPA 
SAI HPGPFTFHPVPHAAL Y PTL LAPR PAAAAATALHLHPLLHP I 
FSGQDLQH P PSHGT 


5972 




2149 


5KUYFVGVDMDNP I GNWDGRFbG VQ L CS FACVBST ZLLH IND 1 1 
PESVTQERRPPKLAFMSRGVGDKGSSSHNKPKATGSTSDPGNRN 
RS ELFYTLNGSSVDSQPQS KSKNTW YI DEVAEDPAKSLTE ISTD 
FDRSSPPLQPPPVNSLTTENRFHSLPFSLTKMPNTNGSIGHSPL 
SLSAQSVMEELNTAPVQBSPPLAMPPGNSHGLEVGSLAEVKENP 
PFYGVIRWIGQPPGLNEVLAGLELEDBCAGVCTDGTF/REGTRY 
FTCALKKALFVKLKSCRPDSRFASLQPVSNQIERCNSLAIWEAY 
LSEWEENTPTQKWEKEGLEIMIG\KKKGIQGHYNSCYLDSTriF 
CL FAFSSVLDT VLLRPKEKNDVE Y YSE TQELtiRTE 1 VNPLR I YG 
YVCATKIMKLRKILEKVEAASGFTSEEKDPE2 FLNILFHHILRV 
EPLIiKlRSAGQKVQDCYFYQI FMEKNE KVGVPTIQQLLEWS PIN 
SNL KFAEAPS CLI IQMPRFG KDFKLFK KI FPS LBLNITDLLEDT 
PRQCRICGGLAMYECRECYDDPD.TSAGKIKQFCKTCNTQVHLHP 
J<RIjNHKYNPVSIiPKDLPDWDWRHGCIPCQI^MELFAVLCIETSHY 
VAFVICYGKDDSaWLFFDSMADRDGGQNGFNIPQVTPCPEVGEYIi 
KMSLEDLHSLDSRRIQGCARPXLCDAIYVPCTQSPTMSI.YK' 


5973 


440 


1761 


illagspsprdqcsqrqssggdkelvtrgctfstawspsamtq 
epfreelaydrmptlergrqdpasyapdakpsdlqlskrlppcf 
shktwvfsvlmgs cllvtsgfslylgwfpaemdylrcaags ci 

PSAI VS FTVSRRNANVI PNFQILFVS TFAVTTTCLIWFGCKLVL 

NPSAININFNLI LLLLLELLMAATVI I AARSSEEDCKKKKGSMS 

DSANILDEVPFPARVLKSYSWBVIAGISAVLGGI1ALNVDDSV 

SGPHLSVTFFWILVACFPSAIASHVAAECPNKCLVEVLIAI3SL 

TSPLLFTASGYLSFSIMRIVEMFKDYPPAIKPSYDVLLLLLLLV 

LLLQA/GPQHGHRHPVRAI4QGQCKAAGCILGHPERPAGAPGWGG 

GQEPPEGVRQGESLE3RRGANGPVTPRRGNRVAAPSLAPGMETH 
NP 


5974 


65 


2007 


NGDGKDLFGHiWAWKSflUl ISNFRRSPHAGMAEDEPDAKSPKTG - 
GKAPPGGAEAGEPTTLLQRLRGTI S KAVQNKVEG ILQD VQKFS D 
NDKLYLYLQLPSGPTTGDKSSEPSTLSNEEYMYAYRWIRNHLEE 
HTDTCLPXQSVYDAYRKYCESLACCPJPLSTANFGKIIRE I FPDI 
KARRLGGRGQS K YCYSGIRRJCTLVSMPPIiPGLDLKGSES PEMGP 
EVTPAPRDELVEAACALTCDWAERILKRS FSS I VEVARFLLQQH 
i*ISARSAHAHVLKAMGLAEEDEHAPRBRSSKPKNGLBNPEGGAM 
KKPERLAQPPKDLEARTGAGP LARGE RKKS WES S APGANNLQV 

nalvarlpli^praprslippipvsppilaprlssgalkvatlp 

LSSRAGAP PAAVP I INMI LPTVPALPGPGPGPGRAPPGGLTQPR 

gtenrbvgiggdqgphdkgvkrtaevpvseasgqappakaakqd 

IEDTASDAKRKRGRPLKKSGGSGERNSTPLKSAAAMESAQSSRIi 

pwbtwgsggegnsaggaerpgpmgeaekgavlaqg\qgdgtvsk 

GGRGPGS 0HTKEAEDKI PL VPS KVSVI KGSR5QKEAFPLAKGE V 
DTAPQGNKDLKEHVLQSSLSQEHKDPKATPP 






4293 


2200 



LGbO^TTSGRXK0AMVTS3^NEPNESVTVEWIEfgGDTKGK\EID" 
LESIFSLNP\DL\VPDGEIEPSP\EfPPPPASSAKVNKIVKNRR 
TV\AS I KNDP P S \RBNRWGS ARARPS Q FPEQFSS AQQNGS V\S 
DISPVQAAKKEFGPPSRRKSNCVKEVEKLQEKREKRRLQQQELR 
BKRAQDVDATNPNYB I MCMIRDFRGSLDYRPLTTADP IDEHRIC 
VCVR KRPLNKXETQMKDLDV I T I PSKD WWVHEP KQ KVDLTR YI> 
SNQTFRFD YAFDDS APNEMV YRFTARPLVET I FERGMATCFAYG 
2TGSGKTHTKGGDFSGKUQDCSKGI YALAARDVFLMLKKPNYKK 
LELQ VYATFFE I YSGKVFDIJiNRKTKLllVLEDGKQO^QVVGLQE 
IEVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQI ILRR 
CGKLHGKFSLIDLAGNERGADT3SADRQTRLEGAEINKSLLALK 
3CIRALGRKKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG 
IAS CENTIJnT*R YANRVTCELTVD PTAAX3DVRPIMHHPPNQI \DD 
iHTQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM 








EEQWEDHRAVFQES IRWLEDEKALLBMTEEVDYDVDS YATQLE 
AILEQKI D I LTELRDKVKS FRAALQE BE QAS KQINPKRPRAL 


5975 


4293 


2200 


LGLQMHTTSGRIHQAMVTSLNEDNES VTVEWIENGDTKGK\ E 1 ID 
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR 
TV \ AS I KNDPPS \ RDNR WGS ARARPS Q FPEQFS S AQQNGSV \S 
DIS PVQAAKKEFGP PSRRKSNCVKEVBKLQEKREKRRLQQQELR 
EKRAQDVDATNPNYEIMCMIRDFRGSLDYRPLTTADPIDBHRIC 
VCVRKRPLNKKETQMKDLDVITIPSKDVVMVHEPKOKVDLTRYL 
ENQTFRFDYAFDDSAPNEMVYRFTARPLVEriFERGMATCFAYG 
QTGSGKTHTMGGDFSGKNQDCS KG I YALAARDVFLMLKKPN YKK 
LELQVyATFFEIYSGKVFm,r.NRKTKLRVLEDGKQQVQWGLQE 
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR 
KGKLHGKFS LI DLAGNERGADTSS ADRQTRLEGAE INKS LLALK 
E C I RALGRWKPHTPFRASKLTQ VLRDSFI GENSRTCMI ATIS PG 
MAS CBNTLNTLRYANRVKE LTVDPTAAGDVRp I MHHP PNQI \ DD 
LETQWGVGS S PQRDDL KLLCEQNEBEVS PQLFTFHEAVS QMVEM 
EEQWEDHRAVFQES I RWLEDEXALLEMTE EVDYD VDS YATQ LB 
AIIiEQKIDI LTELRDKVKS PRAALQEEEQASXQINPKRPRAL 


5976


20 


2949 


vhhlhltrvsvvvnldiilriaqqmgiktlnlvj^\lkraYlef~ 

PEVSWMEVKDPNMKGAMLTNTGKYAI PTIDA\EAYAIGKXEKPP 
PLPEEPSSS SEEDDPI PDELLCLI CKDIMTDAWTPCCGNS YCD 
E CI RTALL BSDEHTCPTCKQNDVS PDALIANKFLRQAVHNFKNE 
TGYTKRLRKQLPSP PP PI PPPRPLIQRNLQPLMRSPI SRQQDPL 
MIPVTSSSTHPAPSISSLTSNQSSLAPPVSGNPSSAPAPVPDIT 
ATVS IS VHS E KSDG P FRDS DNKI L PAAALAS EHS KGTS S I Al TA 
LMBEKGYQVPVLGTPSLLGQSLLHGQLJPTrGPVRINTARPGGG 
RPG WEHSN KLGYLVSP PQQIRRGERS CYRS INRGRHHSERSQRT 
CGPSLPATPVFVPVPPPPLYPPPPHTLPLPPGVPPPQFSPQFPP 
GQP\PPAGYSV?PPGFPPAPANLSTPWVSSGVQTAHSNTIPTTQ 
APPLSREEFYREQRRLKEEEKKKSKLDEFTNDFAKELMEYKKIQ 
KERRRS F SRSKSPYSGSSYSRS S YTY SKSRSGSTRSRS YSRS FS 
RSHSRSYSRSPPYPRRGRGKSRNYRSRSRSHGYHRSRSRSPPYR 
RYHSRSRSPQAFRGQS PNKRNVPQGBTEREYFNRYREVP PPYDM 
KAYYGRSVDFRDPFEKERYREWERKYREWYEKYYKGYAAGAQPR 
PSANRENFS PERFLPLNIRNS P FTRGRREDYVGGQSHRSRNIGS 
IWPEKL SAR DGKNQKDNTKS KE KESSNAPGDG KGNKHK KHRKRR 
KGEES EGFLNPELLETSRKSREPTGVESNKTDSLFVLPSRDDAT 
PVRDEPMDABS ITFKSVSEKDKRERDKPKAKGDKTKRKNDGSAV 
SKKBNIVKPAKG PQEKV1X3\DVRDIiI»DIjNL\QLKKPKEETPKI)L 
i ±iirmiiijKUKKMJCKi>ij \EPP\EKLTIjJnQQK\TPRNKTSQRGKSE 
EGLFQRCQIRKANN 


5977 


1363 


1336 


FLEDRGQVLSH FQCLSLRS INHILHPGAGVAAGPATGW/RE YLT 
P VLKESKFKK TGVI TPCTFVAAGDHLVHHCPTKQWATGEELKVK 
AYLPTGKQFLVTKNVPCYKRCKQMEYSDELEAI IEEDDGDGG WV 
D7YKNTGITGITEAVKEITLENKDNIRLQDCSALCEBEEDEDEG 
EAADMEE Y EE SGLLETDEATLDXRKI VEACKAKTDAGGEDA I LQ 
TRT YDLYI TYDKYYQTPRLV7LFGYDEQRQPLTVEHMYEDI S QDH 
VKKTVTI ENHPHLPPPPMCSVHPCRHAEVMKKI IETVAEGGGEL 
GVHMYLL I FLKFVQAVI PTI E YDYTRHFTM 


5978 


160 


3213 


RDGARRWGGCQSPLTWAPGFYRRFDLATSGRRLRGQTAEPAGRQ 
RPRREPEAMDEQSVESIAEVFRCFICMEKLRDARLCPHC3KLCC 
FSCIRRWLTEQRAQCPHCRAPLQLRBLVNCRWABBVTQQLDTLQ 
LCSLTKHEENEKDKCENHHEKLSVFCWTCKKCICHQCALWGGP1H 
GGHTFKPIAEI YEQHVTKVNEEVAKLRRRLMELIS LVQEVERNV 
EAVRimKDERWEIRNAVEMMIARLDTQLKNKLITLMGQKTSLT 
QETELLESLLQE VEHQLRSCS KS EL I SKSS E ILMMFQQ VHRKPM 
AS FVTTPVPPDFTSELVPSYDSATFVLENFSTLRQRAD PVYSPP 
LQVSGLCWRLKVYPDGNGVVRGYYLS VFLBLSAGLPETS KYEYR 
VEMVHQS CND PTKNI IRBFASDFEVGECWGYNRPFRLDLLANEG 
YLNPQNDTVILRFQVRSPTFFQKSRDQHWYITQLEAAQTSYIQO 
IKNLKERLTIELSRTQKSRDLSPPDNHLSPQNDDALBTRAKKSA 








CSDMLUER \GPYSAS \VREAKEDEEDEEKIQNEDYHHELSDGDL 
DLDLVYEDEVNQLDGSSSSASSTATSNTEENDIDEETMSGENDV 
EYNNMELEEGELMEDAAAAGPAGSSHGYVGSSSR T <3R»tht re a 

ATSSLLDIDPLILIHLLDLKDRSS I BNLWGLQPRPPASLLQPTA 
S YSRKDKD QRKQQAJWRVPSDIiW^KRJUKTQMAS VRCMKTDVKN 
TLSEIKSSSAASGDMQ7SLFSADQAALAACGTENSGRLQDLGME 
LLAKSSVANCYIRNSTNKKSNSPKPARSSVAGSLSLRRAVDPGE 
NSRSKGDCQTLSEGSPGSSQSGSRHSSPRAL-HGSIGDILPKTE 
DRQCKALDSDAVWAVFSGLPAVEKRRKMVTLGANAKGGHLE3L 
QMTDLENNSETGELQPVLPEGASAAPEEGMSSDSDIECDTENEE 
C EEHTS VGG FHDSFM VMTQP PDEDTHSS FPDGE QIGPEDLS FNT 
DENSGR 


5979 


212 


3655 


LPDMTM YiiWbKLLAFGFAFLbTE VFVTGQS PTPSPTDAYLNASE 

TTTLS PSGSAVISTTTIATTPSKPTCDEKYAN ITVDYLYNKETK 

LFTAKLNVNENVEOTNNTCTNNEVHNLTECKNASVS IS HNS CTA 

PDKTLILDVPPGVEKVPVHCCS\QVEQPDSTIWLKWKNIETSTC 

DTQN I TYRFQCGNM I FDNKE I KLENLE PEHE YKCDS EIL YNSHK ' 

FTNASKII KTDFGSPGEPQII FCRSEAAHOGVITWNPPQRS FHN 

FTLCYIKETEKDCLNLDKNLIKYDLQNLKPYTKYVLSIiHAYirA 

KVQRWGSAAMCHFTTKSAPPSQVWNMTVSMTSDNSKHVKCRPPR 

DRNG PHER YHLE VEAGNTLVRNESHKNCDFRVKDLQ YSTD YTFK 

AYFHNGDYPGEP FILHHSTSYNSKALIAFLAFLI IVTSIALLW 

LYKI YDLHKKRS CNLDEOQELVERDDE KQLMNVB P I HAD I LLET 

YKRKIADEGRLFIAEFQS IPRVFSKFPIKEARKPFNQNKNRYVD 

ILPYDYNRVELSEINGDAGSNYINASYIDGFKEPRKYIAAQGPR 

DETVDD7WRMIWEQKATVIVMVTRCEEGNRNKCAEYWPSMEEGT 

RAFGBCCCKDX,T}01KRCP\DYIIQKLNI\^KKBKATGREVTHIQ 

FTSWPDHGVPEDPHLLLKLRRRVNAFSNFFSGPIWHCSAGVGR 

TGTYIGIDAMLEGLEAENKVDVYGYVVKLRRQRCLWVQVEAQYI 

"^W'*" v c xx\ y r i a vniib iUtif YLHNM KKRDPPS E PS PLEAE 

FQRLPSYRSWRTQHIGKQE\ENKSKNRNSNVIPYDYNRVPLKHE 

LEMSKESEHDSI>ESSDDDSDSEEPSKYINASFIMSYWKP\EVMr 

AAQGPLKETIGDFWQMIFQRKVKVIVMLTELKHGDQEICAQYWG 

EGKQTYGDIEVDLKDTDKSSTYTLRVFELRHSKRKDSRTVYQYQ 

YTNWSVEQLPAEPKELISMIQVVKQKLPQKNSSEGNKHHKSTPli 

LIHCRDGSQQXGIFCALLNLLESAETEEWDIFQVVKALRKARP 

GMVSTFEQYQFLYDVIASTYPAQNGQVKKNNHQEDKIEFDNEVD 

KVKQDAN CVNPLGAPEKLPEAKEQABGSEPTSGTEG PEHSVNGP 

ASPALNQGS 


5980 
5981


3 


2363 


IlAWGCKIjRRLiRFT YGTQTRVSLALPGQYEL VHTIi VAHQGNWET I 

PEEDLBVQENNEDAAHDLTELEVTMHHALLQEVDVWAPCQGLR 

PTVDVLGDLVNDFLPVITYALHKDELSERDEQELOEIRKYFSFP 

VFFFKVPXLGSE 1 1 DS STRRMESERSPL YRQLIDLGYLS SSHWN 

CGAPGODTKAQSMLVSQSEKLRHLSTFSHQVLQTRLVDAAKALN 

L^CHCLDIFIKQAFDMQRDLQITPKRLEYTRKKENELYESLMN 

lANRKQEEMKDMIVETLNTMKEEXiLDDATNMEFKDVIVPENGBP 

VGTRE I KCCIRQ I QEt* IISRLNQAVANKL I S S VDYLRES FVGTL 

ERCLQSLEKSQDVSVHITSNYLKQILNAAYHVEVTFHSGS3VTR 

MLWEQIKQIIQRITWVSPPAITLEWKRKVAQEAIESLSASKLAK 

SICSQFRTRLNSSHEAFAASLRQLEAGHSGRLEKTEDLWLRVRK 

DHAPRLARLSLESRSLQDVLLHRKPKLGQBLGRGQYGWYLCDN 

WGGHFPCALKSVVPPDEECHWNDLALEFHYMRSLPKHERLVDLKG 

S VI D YNYGGGS S IAVLLIMERLHRDLYTGLKAGLTLETRLQIAI* 

DVVEGIRFLHSQGLVHROIKLKNVLLDKQNRAKITDLGFCKPEA 

MMSGSIVGTPIHKAPELFTGKYDN3VDVYAFGILFWYICSGSVK 

LPEAFERCAS KDHLWNNVRRGARPBRLP VFDEECNQLMEACWDG 

D PLKRPLLGI VQPMLQG IMNRLCKS \NS BQPNRGLDDST 




1 


2519



3RKHSAAMERPWGAADGLSRWPHGLGLLLLLQLLPPSTLSQDRL 

DAPPP PAAPLPRWSGP igvswglraaaa\ggafprggrwrrs ap 
3 \edeecgrvrdfvakuvnnthqhvfddlrgsvslswvgdstgv 
ilvlttfhvplvimtfgqsklyrsedygknfkditdlimntfir 








TE FGMAI G PENS G KVVLTAE VSGGSRGGRI FRS S D FAKNF VQTD 
LP FH PLTQMMYS PQNSD YLLALS TENGL WS KNFGGKWEE IHKA 
VCLAKWGSDNTI FFTTYANGS CXADLGALEL WRTS DLGfCS FK7I 
GVKIYS FGLGGRPLFAS VMADKDTTRRIHVSTDQGDTWS MAQLP 
S VGQEQ FYS I IAANDDM VFMHVDEPGDTGFGTI FTSDDRG TVYS 
KSLDRHLYTTTGGErDFTNVTSLRGVYITSVLSEDNSlQTMITF 

MAPLSBPNAVGIVIAHGSVGDAISVMVPDVYISDDGGYSWTKML 
EG PHYYT ILDSGG 1 1 VAIEHS SRP I NVI KFSTDEGQCWQTYTFT 
RDPIYFTGLASEPGARSMWISIWGFTESFLTSQWVSYTIDFKDI 
LERNCEEKDYTIWLAHSTDPEDYEDGCILGYKEQFIiRLRKSSVC 
QNGRDYWTKQPSICLCSLEDFLCDFGYYRPENDSKCVEQPELK 
GHDLEFCLYGREEHLTTO3YRKIPGDKCQGGVNPVREVKDLKKK 
CTSNFLS PEXQNSKS NS VP 1 1 LAI VGLML VT WAGVL IVKKY VC 
GGR PLVHL YS VLQQH \AEA\NG VDG VDALDTAS HTNKSGYHDDS 
DEDLLE 


5982 


56 


2316 


ATRPPRGSS WCRQFSRTASAAPGRSNMLRI PVRKALVGLSKSPK 
GCVRTTATAASNLIEVFVDGQSVMVEPGTTVLQACEKVGMOIPR 
FC YHE RhS VAGNCRMCL VE I E KA P K WAACAM P VM KG WNI LTNS 
EKSKKAREGVMBFLIANHPLDCPICDQGGECDLQDQSMMFGNDR 
SRFLEGKRAVEDKNIGPLVKTIMTRCIQCTRCIRFASEIAGVDD 
LGTTGRGNDMQVGTY3 BKMFMSELSGNI IDI CPVGALTSKPYAF 
TAR P WETR KTES IDVMDAVGSNI WSTRTGEVMR ILPRMHEDI N 
EE W I S DKTRFA YDGIiKRQRLTE PMVRNEKGLLT YTSWEDALSRV 
AGMLQS FQGKDVAAI AGGLVDAEALVALKDLLNRVDSDTLCTEB 
VFPTAGAGTDLRSNYLLNTTIAGVEEADVVLLVGTNPRFEAPLF 
NARIRKSWLHNDLKVALIGSPVDLTYTYDHLGDSPKILQDIASG 
SHPFSQVLKEAXKPMWLGSSALQRND3AAILAAVSS IAQKIRM 
TSGVTGIWiCVMNILHRIASQVAALDLGYKPGVEAIRKNPPKVLF 
LLGADGGC I TRQDLP KDCFI IYQGHHGDVGAP IAD VILPGAAYT 
EKSATYVNTEGRAQQTKVAVTPPGLAREDWKI IRAJOSEIAGMTL 
PYDTL\DQVRNRLBEVSPNLVRYDDIEG\ANYFQQANELSKLVW 

QQLrju^PLVPPQLXMKDFYMTDaiSRASQTMAKCVECAVTEGAQA 
VEEPSIC 


5983
5984


248 


1763 


EARGDGGRRRHRASGRRAGRGEP \AGLKSQGQRAVPKRAVaRGG 

rq\ysaaiallepagsbiadd:*silysnraacylkegncsgciq 

*-^xjv«xi_i_mrc omai'IjIjRRAMAYETLEQYGKAYVDYICTVLQIDC 
GLQLANDSVNRLSRILMELDGPNWREKLSL1PAVPA3VPLQAWH 

pakemisk^agdssshrc^itdektfi^keegnqcvndkkyk 

DA1^KYSECLKINKICECAIYTNFJVI,CYLKLCQFEEAK0DCI)QAL 
QLADGNVKAF YRRALAHKGLKNYOKSliT dtjik v tt.t nDQTTPsv 

MELEE VTRLLNLKDKTAPFNKEKERRKI E IQE VNEGKEEPGRPA 

gevstgclasekggkssrspedpeklpiakpnnayefgqiinal 

STRKDKEACAHL^ITAPKDLPMFLSNKLEGDTFLLLIQSLKNN 

LIEKDPSLVYQHLLYLSKAERFKMMLTLISKGQKELIEQLFEDL 
SDTPNNHFTLEDI QALKRQ YEL 


5985 


755 


1193


SSVCMACTYVSWWSKKQRSVSFU^GLMRVSTCPELRLHHSFVL 
TGDVGRRI CRLLVGLFTKGDTSS KRVHPFS PGPCFLLCDLAR VG 

SSPKINVSPFYQN\QTSTQRSCrVFVWQRCSLVGPFQVTVFTMY 
FHHSLRSrSRFSSG 




22 


1408 



KKVARPGTAEPAKARRWRkGfeARRDLAGAERKAGVSERGDSGR 
RRPNPS IPSAAAGMSHIQlPPGLTBLLQGYrVEVLRQQPPDLVE 
FAVEYFTRliREARAPAS VLP AATPRQSLGH P PPE PG PDRVADAK 
GDS ES EEDEDI.E VPVPSRFNRRVS VCAETYNPDEEEEDTDPRVI 
KP KTDEQRCRLQEACKDILLFKNLDQEQLSQVLnAMFBR I VKAD 
BHVIDQGDDGDNFYVIERGTYDILVTKDNQTRSVGQYDNRGSFG 
E LALM YNTP RAAT I VATS EG S L WGLDR VTFRR 1 1 VKNNAKKR KM 
PES FIES VP LLKS LE VSERMKI VD VI GB K I YKR/DGER 1 1 TQGE 
fC\ADSFYII2SGEVSILIRSRTKSNKDGGNQBVEIARCHKGQYF 
3ELALVTNKPRAASAYAVGDVKCLVMDVQAFERLLGP CMD IK KR 
•JISHYEEQLVKMFGSSVDLGNLGQ 


5986


ifio* " - 


484 


DAWKSTSLTPHWKLWGRHRGRRRGLAHPKNHLSPQQGGATPQVP 
SPCCRFDS PRGPP PPRLGLLGALMAEDGVRGS PPVPSGPPMEED 
GLRWTPKSPLDPDSGLLSCTLPNGPGGQSGPEGERSLAPPDASI 
LISNVCS IGDHVAQEJ.FQGSDLGMAEEABRPGEK\AGQHSPLRB 
EHVTCV0S ILDEFLQT\ YGSLI PLSTDE WEKLBD I PQQEPSTP 
SRKGLVLQLlQSyQRMPGNAMVRGFRVAyKRHVLTMDDLGTLYG 
QNWLNDQ VMNM YGDLVMDT VP E K \ VHF FNS FFY \DKLRTKG YDG 
VKRWTKNVD I FNKELLL I P IHLE VHWSL I S VDVRRRTI TTFDSQ 
RTLNR RCP KH I AKYLQ AEAVK KDRLD FHQGWKG YFKMNVARQNN 

DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQIYKELCHCKL 
TV 


5987 


1806 


484 


DAWKS TSLTFHWKLWGRHRGRRRGLAHF KNHLS PQQGGATPQ VP 
SPCCRFDSPRGPPPPRLGLLGALMAEDGVRGSPPVPSGPPMEBD 
GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDASI 
LISNVCSIGDHVAQELFQGSDLGMAEEAEKPGEK\AGQHSPLRE 
EHVTCVQSILDEFLQT\YGSLIPLSTDEWEKLEDIFQQEFSTP 
SRKGLVLQL I QSYQRM PGMAMVRGFR VAYKRHVLTMDDLGTLYG 
CN5fLNDQVMNM YGDL VMDT VPEK\ VHFFNS FF Y\ D KLRTKG YDG 
VKRWTKNVDIFNKELLLIPIHLEVHWSLISVDVRRRTITYFDSQ 
RTLNRRCPKH IAKYLQAEAVKKDRI*D?HQGWKGYFKMNVARQNN 
DS DCGAFVLQYCKHLALS QP FSFTQQDMPKLRRQI YKEL CHCKL 
TV 


5988 


" 1292 


410 


FKKYFLSFLGLLESSIISRDRIHNLVI^FIJIATHNLVWWFTCRFQ 
RLDCI YLNAG I MPNPQLNI KALLFGLFS \ AEGLLTQGD K I TADG 
LQEVFETDVFGHFILIRELEPLLCHSDNPSQLIWTSSRNARKSN 
FSLEDFQHS KGKE P YSS S K YATDLLS VALNRNFNQQGLYSNVAC 
PGTAL^NLTYGILPPFIWTLLMPAILLLRFFANAFTLTPYNGTE 
ALVWIiFHQKPE S LNPLI KYLS ATTCFGRNY IMTQKMDLDEDTAE 
KFYQKLLELE KH I RVTIQKTDNQARLS GS CL 


5989


194 


2610 


AMDFPQHSQHVLEQI^QQRQLGLLCDCTyWDGVHFKAHKA^TLA - 

ACSE YFKMIjFVDQKDWHLDI snaaglgqvlefmytaklsls pe 

NVDD VI* \AV7ATFIiQMQD 1 1 TACHAtiKS LAEPATS PGGNAEAI»AT 

eggdkrakebkvatstlsrleqagrstpigpsrdlkeerggqaq 
saasgaeqtekadaprepppvelkpdptsgmaaaeaeaalsess 
eqembveparkgeebqkegeeqeeegagpaevxeegsqlengea 
p eenenee s agtds gqelgs earglrsgtygdrteskaygs v ih 
kcedcgkefthtgnfkrhirihtgekpfscrecskafsdpaack 

AHBKTKSPLKPYGCEECGKSYRLISLLNLRKKRHSGEARYRCED 
CGKLFTTSGNLKRHQLVHSGEKPYQCDYCGRSFSDPTSKMRHLE 
THDTDKEHKCPHCDKKFNQVGNLKAHLKIHIADGPLKCRECGKQ 
FTrSGNLKRHLRIHSGEKPYVCIHCQRQFADPGALQRHVRIHTG 
EKPCQCVMCGKAFTQASSLIAHVRQHTGEKPYVCERCGKRFVQS 
SQIiANH I RHHDNI RPHKCS VCS KAFVNVGDLS KHI I IHTGBKP Y 
LCDKCGRGFNRVDNLRSIIVKTVHQGKAG IKILBPEEGSEVSWT 
VDDMVTIATEALAATAVTQLTVVP VGAAVTADBTEVLKAEIS KA 
VKQVQEEDPNTHILYACDSCGDKFIiDANSIiAQHVRIHTAQALVM 
FQTDADF YQQ YGPGGTW PAGQ VLQAG ELV FR PRDGAEG QPALAE 
TSPTAPECPPPAE 


5990 


2 


4700 


FGPG PDSGGGARGSG WGS RSQAP YGTLGAVSGGEQ VLLHEEAGD 
SGF VS LSRLG PS LRDKDLEMEELMLQDETLLGTMQS YMDASL I S 
LIEDFGSLGEVEMSLPDPSWDFSPPSFLETSSPKIiPSWRPPRSR 
PRWGQSPPPQQRSDGEEEEBVASFSGQ1LAGELDNCVSS I PDFP 
MHLACPEEBDKATAAEMAVPAAGDES ISSLSELVRAMHPYCLPK 
LTHIiASLEDELQEQPDDLTLPEGCWLEIVGQAATAGDDLEIPV 
VVRQVSPGPRPVLLDDSLETSSALQLLMPTLESETEAAVPXVTL 
CSEKEGLSLNSEEKLDSACLLKPREWEPWPKEPQNPPANAAP 
GSQRARKGRKKKSKEQPAACVEGYARRLRSSSRGQSTVGTEVTS 
QVDKLQ KQPQEELQKESGPLQGKGKPRAWARAWAAALENS SPKN 
LERSAGQSSPAKEGPLDLYPKLADTIQTNPIPTHLSLVDSAQAS 
PMPVDSVEADPTAVGPVLAGPVPVDPGLVDLASTSSELVEPLPA 
EPVLINPVLADSAAVDPAWPISDNLPPVDAVPSGPAPVDLALV 








DP VPNDLTP VDP VLVKSRPTDPRRGAVSS ALGGSAPQLLVES E S 
LDPPKTIIPEVKEVVDSIiKIESGTSATTHBARPRPLSLSEYRRR 
RQQRQAETB KRS PQP PTGKWPSLP ETPTGLADI PCLVI PPAPAK 
KTALQRSPETPLEICLVPVGPSPASPSPEPPVSKPVASSPTEQV 
PSQEMPLIARPSPPVQSVSPAVPTPPSMSAALPFPAGGLGMPPS 
LPPPPLQPPSLPLSMGPVLPDPPTHYAPLPSWPCYPHVSPSGYP 
CLP P ? PTVPLVSGTPGAYAVppTCSVPWAP P PAPVS P YS STCTY 
GP LG WGPGP QHAP FWST VP PPPLP PAS IGRAVPQPKMES RGTPA 
GPPENVLPLSMAPPLSLGLPGHGAPQTEPTKVEVKPVPASPHPK 
KKVSALVQS PQMKALACV SAEGVTVEEPASERLKPETQETRPR E 
KPPLPATKAVPTPRQSTVPKLPAVHPARLRKLSFLPTPRTQGSE 
DWQAFISE IGIEASDLSSLLEQ FEKS BAKKECPPPAPADSLAV 
GNSGG VD I PQEKRPLDRLQAPBLACJVAGLTPPATP PHQLWKPLA 
AVSLIAKAKSPKSTAQEGTLKPEGVT2AKirPAAVRLQEGVHGpS 
RVHVG33DHDYC\VRSRTPPKK\MPALLIPEVGSRWNVKRHQDI 
TI KPVLSLGPAAPPPPC IAASREPLDHRTSS EOADPSAP CLAPS 
SLLSPEASPO^DMNTRTPPBPSAKQRSMRCYRKACRSASPSSQ 

c-wqgr:«grnsrsvssgsnrtseasssssssssssrsrsrslspp 

HKRWRRSSCSSSGRSRRCSSSSSSSSSSSSSSSSSSSSRSRSRS 
PSPRRRSDRRRRYSSYRSHDHYQRQRVLQKERAIEERRWFIGK 
IPGRMTRSELKQRFSVFGBIEECT1HFRVQGDNYGFVTYRYAEE 
AFAAIESGHKLRQADBQPFDLCFGGRRQFCKRSYSDLDSNREDF 
D P AP VKSKFDS LDFDTLLKQ AQKNLRR 


5991 


334 


1379 


RLSSHFSQCSPSIYC\TKFDKQGNVTSFERKKTELYQELGLQAR 
DLRFQHVMSITVRNNRI IMRMEYLKAVITPECLLILDYRNLNLK 
QWLFR2LPSQLSGEGQLVTYPLPFEFRAIEALLQYWINTLQGKL 
SILQPLILETLDALGDPKHSSVDRSKLHILLQIJGKSLSELBTDI 
. KI FKES ILBI LDEEELLEELCVS KWSDPQVFEKSSAGIDHAEEM 
ELLLEN YYRLADDLSNAARELRVL IDDSQS 1 1 FINLDSHRNVMM 
RLNLQLTMGTFSLSLFGLMGVAFGMNLESSLEEDHRIFWLITGI 
MFMGS GLI WR RLLS FTX5R/ LARS S IAS YGM KDMVHGGI VEGL 


5992 


2 


609 


AGPDFRLVCGVSGSGFPGGRQGQATEWRPLRPWNGAMEKLRRVL " 
SGQDDEECX3LTAQDSQINL/SEVLDASSLSFNTRLKWFAI CFVC 
GVFFSI LGTGLLWLPGG I KLFAVF YTLGNLAALASTCFLMGPVK 
QLKKMFBATRLIATIVMLLCFIFTLCAALWWHKKGLAVLFCILQ 
FLSMTW YSLSYI P YARDAVI KCCSSLLS 


5993 


1650 


594 


AEGliGS WAWAGLG WAGRHMEAGGATGAIiGVG CKLPSAFC FPGS 
SVAMDMFQKVEKIGEGTYGVVYKAKNRETGQLVALKKIRLDLEM 
EGVPSTAIRE I SLLKELKHPNIVRLLD WHNBRKLYLVFEFLSQ 
DLKKYMDSTPGSELPLHL IKS YLFQLLQGVSFCHSHRVIHRDLK 
PQNLLINEI^AIKLADFGLARAR3VPLRTYTHEVVTLWYRAPEI 
LLATR F YTTAVD I WS IGC I FAEMVTRKALF PGDS \E IDQ\LFRI 
FRMLGTPSEDTWPGVTQLPDYKGSFPKWTRKGLEEIVPNLEPEG 

RDLLMOLLQYDPSQRITAKTALAHPYFSSPEPSPAARQYVLQRF 
RH 


5994


394 



1934 


AGEVGLHVWIRGMRIQPQ/KAAAIIDLDPDFEPQSRPRSCTWPTT" 
PRPE IANQPS KPPE VEPDLGEKVHTEGRS EP ILLPSRLPE PAGG 
PQPG ILGAVTGPRXGGSRRNAWGNQS YAELISQAI ESAPE KRLT 
LAQ I YEWMVRTVP Y FKDKGD SNSS AGWKNS IRHNLSLHSKF IKV 
HNEATGXSSWWMLNP EGGKSGKAPRRRAASMDS S SKLLRGRS KA 
PKKKPSGLPAPPEGATPTSPVGHFAKWSGSPCSRNREEADMim' 
FRPRSSSNASSVSTRLSPLRPESEVLAE3IPASVSSYAGGVPPT 
ijOiUjljJi/ijLajyiiNL.TSSHSLLiS RSGLS G FS LQH PG VTGPLHTYSS 

SLFSPAEGPLSAGEGCFSSSQALEALLTSDTPPPPADVLMTQVD 
P1LSQAPTLLLLGGLPSSSKLATGVGLCPKPLEAPGPSSLVPTL 
SMIAPPPVMASAPIPKALGTPVLTPPTEAASQDRMPQDLDLDMy 
MENLECDMDNI ISDLMDEGEGLDFNFEPDP 


5995


2 


2437 


RPPGPGPASGAWLCTRARGSAAFVPPLPRPPSRGARRRRRLPGR 
GVAALRRGPGS APGLPRGRAERS AAG SGRGPSREERGAAAAAAA 
AEMMEELHSL\DP\RRQELLEARF\TGLGVSKGPLNSESSNQSL 
CSVGSLSDKEVETPEKKQNDQRNRKRKAEPYETSQGKGTPRGHK 








I SDYFERRVEQPLYGLDGSAAKBATEEQSALPTLMS VMLAKPR™ 
DTEQLAQRGAGLCFTFVS AQQNS PSSTGSGNTEHS CS SQKQ I S I 
QHR2T\QSDLTI EKI SALENS KNSDLBKKEGRIDDLLRANCDLR 
RQI\DEQQKMLEKYK\ERLNRCFDNEPRNFLIEKSKQEKMACRD 
KSMQDRLRLGHFTTVRHGAS FTEQWTDG YAFQNLI KQQERINSQ 

REE ierqrkmlakrkppamgqappatneqkorksktngaenetl 

TLAEYHEQEEI FKLRLGHLKKEEAEIQAELBRLERVRNLHIREL 
KRIHNEDNSQFKOHPTLNDRYLLLHLLGRGGFSEVYKAFDLTEQ 
RYVAVKIHQLNKNWRDBKKENYHXHACREYRIHKELDHPRIVKL 
YDYFSLDTDSFCTVLBYCEGNDLDFYLKQHKLMSBKEARSIIMQ 

KIMDDDSYWSVDGMBLTSQGAGTYWYLPPBCFWGKEPPKISNK 
VD V WS VG V I F YQCL YGRKPFGHNQSQQD I LQENTI LKATEVQFP 
PKPWTPEAKAFIRRCLAYRKBDRIDVQQLACDPYLLPHIRKSV 
STSSPAGAAIASTSGASNNSSSN 


5996 


1612


981 


LFS I WFGS I VNEG YLNS AS EGEE FCI YNRNPNACS YG VAVGVL 
AFLTCLLYLALDVY F PQ I SS VKDRXK\ A VLSGHP WSGE PHPAA 
FWAFLWFTGDS CYL \ANQWQVS KP KDNPLNEGTDAS PGRPS P FS 
FFSI FTWSLTAALAVRRFKDLSFQEEYSTLFP\ ASAQP 


5997 


1612 


981 


■'W**** * UDrvaxiiAri/ronluon Av-K/oWVSWRSRPGCE 
LFSI WFGS I VNEGYLNSASEGBEFCI YNRNPNACS YGVAVGVL 
AFLTCLLYLALDVY FPQI SSVKDRKK\ AVLS GHPWS G EPHPAA 
FWAFLWFTGDSCYL\ANQWQVSKPKDNPLNEGTDASPGRPSPFS 
FFS 1 7TWSLTAALAVRRFKDLSFQEEYSTLFP\ASAQ P 


5998 


1612 


981 


DQQACLLGLMLTLBFG I LEFDPS WIGS WTUR /SWVS WRSRPGCE" 
LFS I WFGS I VNEGYLNSASEGEEFCI YNRN PNACS YGVAVGVL 
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA 
FWAFLWFTGDSCYL\ANQWQVSK PKDN PLN EGTDAS PGRP S PFS 
FFS I FTWSLTAALAVRR FKDLSFQEEYSTLPP\ASAQP 


5999 


2 


1790 


RP PMEKARRGGDG VPRG P VLH I VWG FHHKKGCQVEFS YP PLI P 
GDGHDSHTLPBEMKYLPFLALPDGAHNYQEDTVFFHLPPRNGNG 
ATVFGI SCYR \Q IEAKALKVRQAD ITRETVQKS VC VLS KL PLYG 

LLQAKLQL1THAYFEEKDFSQISILKELYEHMNSSLGGASLEGS 
QVYLGLSPRDLVLHRRHJCGTjTT.'FKT.TT.T.P'VTriTT cvrcnTrxTTfT 

AL1MTVLSLFPGMIEHGLSDCSQYRPRKSMSEDGGLQESNPCADD 
F VSASTADVSHTNLGT I RKVMAGNHGEDAAMKTEE PLFQVEDSS 
KGQEPNDTNQYLKPPSRPSPDSSESDWETLDPSVLEDPNLKERE 
QLGSDQTNL F PKDS VPSES L P I TVQPQANTGQ WLI PGL I SGLE 
EDQYGMPLAI FTKG YLCL P YMALQQHHLLSDVTVRG FVAGATNI 
LFROQKHLSDAIVEVEEALIQIHDPELRKLLNPTTADLRFADYL 
VRHVTENKCDVFLDGTGWEGGDEW IRAQFAVY IHALLAATLQLV 
LFRIVNVAKKIGNVMVTT\SRNVVQTGK\AVGQSVGGAFS\SAK 
TA\MSSWLSTFTTSTSQSLTEPPDEKP 


6000 
6001


101 


1561 


TEP CRTAEN CTATMS ENNKNSLESSLRQLKCH FTWNLMEG ENSL 
DDFEDKVFYRTEFQNREFKATMC^IAYLKHLKGQNEAALECt^ 
KABE L I QQ EHADQA5 I RSLVTWGNYAW VYYHMGRLS DVQ I YVDK 
VKHVCEKFSSPYRI3SPELDCEEGWTRLKCGGNQNERAKVCFEK 
ALEKKPKNPEFTSGLAIASYRLDNWPPSQNAIDPLRQAIRLNPD 
NQYLKVLLALKLHKMREEGBEEGEGEK\L VEEALEKAPG \ VTD V 
LR6AA\ KFYRGKDE PDKAI BLLKKALE yi p \nnaylhcqigccy 
RAKVFQVMNLRBNGMYGKRKLLELIGHAVAHLKKADEANENLFR 
VCSILASLHALADQYEDAB YYFQKEFS KELTPVAKQLLHLRYGN 
FQLYQMKCEDKA1HHFIEGVKINQKSREKEKMKDKLQKIAKMRL 

SKNGADSEMHVLAFLQELNEKMQQADEDSERGLESGSLIPSAS 
SWNGB 




176 


1038 


AFAHS PSRGHRKTHIHTPRHTPRCTMAESHLQSSLI TASQFFE I 
MLHFDADGSGYLEGKELQNLIQBLQQARKKAGLELS PEMKTFVD 
Q YGQRDDGKIGI VELAHVLPTEENFLLLFRCQQLKS CE \EFMKT 
WRKYDTDH3GFI BTEELKNFLKDLLEKANKIVDDTKLAE YTDLM 








LKLFDSNNDG KLE LTEMAR £»£p VQEN FLLKFQG 1 KMCGKE FNKA 

PELYDQDGNGYlDENELDALLKDliCEKNKQDLDINNITTYKKNI 
MALSDGGKL YRTDLAL ILCAGDN 


6002 


977 


81 


LAPPGGGLHIPPRTPLSHSRPPPSHHAPHPSPLPLPPADLHPHS 
SMAQRSDLLEIjDCQLTRDRVWVSHDENLCRQSGLWRDVGSLDP 
EDL PLYKBKLEVYFS PGHFAHGSDRRMVRL EDLFQR FPRTPMS V 
EIKGKNEELIREQ/VLVRRYDRNBITIWASEKSSVMKKCKAANP 
EMPLSFTISRGFWVLLSYYLGLLPFIPIPEKFFFCFLPNIINRT 
YFP?S CS CLNQ LLA WS KWL I MRKSL I RHLEB RGVQWFWCIjNE 
ES DFEAAFSVGATGVITDYPTALRHYLDNHGPAARTS 


6003


140 


4098 


GKLRAFRGMRRLICKRICDYKSFDDEKSVDGNRP^SAASAFKVP 
APKTSGNPAN SARKPGSAGG P KVG AGASKEGG AGAVDEDDF I KA 
KTDVPS IQI YS SRELEETLNKIR E IL SDDKHD WDQRANALKKIR 
SLLVAGAAQYDCFFQHLRLLDGALKLSAKDLRSQWREACITVA 
KLSTVLGNKFDHGABAI VPTLFNLVPNS AKVMATSGCAAIRFI I 
RHTH VP RLI PL ITSNCTS KS VPVRRRSFEFLDLLLQEWQTHSLE 
RHAAVLVETIKKGIHDADAEARVEARKTYMGlJlNHFPGEAETLY 
KSliEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 
KWSTANPST VAGR VSAGSS KASS LPGSLQRSRS DIDVNAAAGAK 
AHHAAGQS VRSGRI/3AGALNAGS YASLEDTSDKLDGTAS EDGRV 
RAKLSAPIiAGMGNAKADSRGRSRTKMVSQSQPGSRSGS PGRVLT 
TTALSTVSS GVQRVLVNSASAQKRSKIPRSQGCSREAS PSRLSV 
ARS S R I PRPS VSQGCS REASRES S RDTS P VRS FQ PLAS RHHSRS 
TGALYAPBVYGASGPGYGISQSSRLSSSVSAMRVLNTGSDVEEA 
VADALLLGD IRTKKKPARR RYES YGMHSDDDANSDASS ACS ERS 
SUNOS IPTYMRQT\EDV\ AEVLNRCASSmfSERFCEGLLGLQN 
LLKNQRTLSRVELKRLCE I FTRM FAD PHGKR V FSM FLETLVDFI 
QVHKDDLQDWL FVLLTQLL KKMGADLLGS VO > AKVQKALD VTRES 
FPNDLQFNILMRFTTOQTQTPSLKVKVAILKYIETLAKQMDPGD 
FINSSETRLAVSRVITWTTEPKSSDVRKAAQSVLISLFELNTPE 
FTMLLGALPKTFQDGATKLLHNHLRNTGNGTQSS MGSP LTRPTP 
RS P AN WS S PLTSPTNTS ONTL Q P«3 AF nVTvr fvmm e ptyt voorir 
VTEAI QNFS FRSQEDMNEP LKRDS KKDDGDSMCGGPG\MSD PRA 
GGDATDSSQTAL\ DNKAS LLHSMPTHSS PRSRD YNPYN YSDS I S 
PFNKSALKEAMFDDDADQFPDDLSLDHSDLVAELLKEr^NHMER 
VEERKIALYELMKLTQEESFSVWDEHFKTILLLLLETLGDKEPT 
IRALALKVLREILRHQPARFKMYAELTVMKTLEAHKDPHKEVVR 
SAEEAASV\ LATS I \SPEQCIKVLCPI IQTADYPINLAAIKMQT 
KVI E R VS KE TLNLLL PE IM PGL I QG YDNS ES S VR KACVFCL VAV 
HAVIGDELKPHLSQLTGSKMKLLNLY3 KRAQTG SGGAD PTTD VS 
GQS 


6004 


140 


4098 



OiOiRAFRGMRRilC^ICDYKSFDDEBSVDGNRPSSAASAFKVP" 
APKTS GNPANSARKPGS AGGPKVGAGAS K EGGAGAVDEDDFI JCA 
FTDVPSIQIYSSRELEETLNKIREILSDDKHDWDQRANALKKIR 
SLLVAGAAQYDCFFQHLRLLDGALKLSAKDLRSQVVREACITVA 
HL STVLGNKFDHGAE A IVPTL FNL VPNS AKVMATS G CAAI R F T I 

RKTHVPRLIPLITSNCTSKSVPVRRRSFEFLDLIiLQ^QTHSLE 
RHAAVLVETIKKGIHDADAEARVEARKTYMGLRNHFPGSAETLY 
NSLEPSYQKSLQTYLXSSGSVASLPQSDRSSSSSQESLNRPFSS 
KKSTANPS TVAGRVS AGSSKASS LPGS LQRSRSDI D VNAAAGAK 
AHHAAGQS VRSGR LGAGALNAGS YASLBDTSDKLDGTAS EDGRV 
RAKLS APLAGMGNAKADSRGRS RTKMVSQSQPGSRSGS PGRVLT 
TTALSTVSS GVQRVLVNSASAQKRSKIPRSQGCSREAS PSRLSV 
ARSSRIPRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 
TGALYAP E VYGAS GPG YG I SQS S RLS 3 S VS AMRVLNTG SD VE EA 
VADALLLGD IRTKKKPARR RYE SYGMHSDDDANSDASSACSERS 
YS SRNGS IPT YMRQT\EDV\AE VLNRCASSNMSER KEGLLGLQN 
bLKNQRTLSR VE LKRLCE I FTRMFADPHGKRVFSMF LETLVDFI 
3\^IOJDIXJDWLFVIiLTQLLKKMGADLLGSVQAKVQKALDVTRES 
PPNDLQFNILMRFTVDQTQTPSLKViCVAI LKYI ETLAKQMDPGD 
PINSSSTRLAVSRVITWTTBPKSSDVRKAAQSVLISLFELNTPE 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








" FTMLU3ALP KTFQI^ATKttk^!LfeOTGNGTQS5MGS PLTRPTP~~ 
RSPANWSSPLTSPTNTSQNTLSPSAFDYDTENMNSEDIYSSLRG 
VTEAXQNFS PRSQEDMNEPLKRDSKKDDGDSMCX3GPG \MSDPRA 
GGDATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 
PFNKSALKEAMPDDDADQFPDDLSLDHSDLVAELLKBLSNHKER 
VEERKIAL YELMKLTQEESFS VWDEHFKTILIrLLLETLGDKE PT 
IRALALKVLRE ILRHQPARFKNYAELT VM KTLEAH KDPHKBWR 
S AEEAAS V \ LATS 1 \S PEQCI KVLCP 1 1 QTAD YPI NLAAI KNQT 
KVI ERVS KETLNLLL P E I M PG L I QG YDNSES S VRKACVPCLVAV 
HAVIGDELKPHI»SQLTGS KMKIiLMXiYI KRAQTGSGGADPTTDVS 
GQS 


6005 


133 


5955 


RSSGRRQEQliGQFPGRBRKGMASGLG^ PS PCSAGS EEEDMDALL 
WNSLPPPHPENEEDPBEDLSETETPKLKKKKKPKKPRDPKIPKS 
KRQKKERMLLCSQLGDSSGEGPEFVEEEEEVALRSDSEGSDYTP 
GKKKFCK3CLGPKKEKKSKSKRKEEEEEDDDDDDDSKEPKSSAQLI, 
BDWGMEDIDHVFSEEDYRTLTNYKAPSQFVRPLIAAKMPKIAVS 
KMMMVLGAKWRS FSTNNP FKGSSGAS VAAAAAAAVAWES M VT A 
TEVAPPPP PVEVPIRKAKTKEGXGPNARRKPKGS PRVPDAKKPK 
PKKVAPLKI KLGGFGS KRKRSSS EDDDLDVESDFDDAS INSYS V 
SKJSTSRSSRSRKKLRTTKKKKKGEEEVTAVDGYETDHQDYCEV 
CQQGGEI ILCDTCPRAYHMVCLDPDMEKAPEGKWSCPHCEKEG I 
QWBAKEONS EGEE ILBE VGGDLBBEDDHHME FCRVCKDGGELL C 
CDTCPSSYHIHCLNPPLPBIPNGEWLCPRCTCPAUCGKVQiaLI 
WKWGQPPSPTPVPRPPDADPNTPSPKPLEGRPERQFFVKWQGMS 
YWHCSWVSELQLELHC\QVMFRNYQRKNDMDEPPSGDFGGDEEK 
S\RKRKNKDPKFAEMEERFYRYGIKPEW\MMIHRILNHSVDKKG 
HVHYLI KWRDLPYDQAS WESEDVE IQDYDLFKQS YWNHRELMRG 
EBGRPGKKLKKVKLRKItBRPPETPTVDPTVKYERQPEYLDATGG 
TLH P YQMEGLNWLRFS WAQGTDTILADEMGLG KTVQTAVFL YS L 
YKEGHSKGPFI*VSAPLSTIIN\WEREFEMWAPDMYV\VTYVGDK 
DSRAI I REKEFS \FEDNAIRGGKKASRMKKEAS VKFHVLLTSYE 
LI T I DMAILGS I DWACL I VDEAHRL KKNQS KFFR VLNGYSLQHK 

LLLTG RPIIQNNLEELFKLLNFLTPER FHNLEGFLSEFAD IAKED
QIIQCLHDMLG\PHMI^RLKADVFICNMPSKTELIV\RVELSPM\Q
KKY YK\ YI LHS KFL KALN\ ARGGGNQVSLLNVVMDLKKCCNHP Y
LFPVAAMEAPKMPNGWDGSALIRASGKLLLLQKMLKNLKEGGH

RVLIFSQMTKMLDLLEDFIiBHEGYKYERIDGGITGNMRQEAIDR 
FNAPGAQQFCFLLSTRAGGLGINLATADTVI I YDSDWNPHNDIQ 
AFS RAHR I GQNKKVM I YR FVTRAS VEE R I TQVAKJCKMMLTHL W 
RPGLGSKTGSMSKQELDDILKFGTEELFKDEATDGGGDNKEGED 
SSVIHYDDKAIERLLDRNQDETEDTELQGMNEYLSSFKVAQyw 
REBEMGEB EEVERE 1 1 KQEES VD PDYW E KLLRHHYEQQQEDLAR 
NI/5KGERIRKQVNYNDGSQEDRDWQDDQSDNQSDYSVASEBGDE 
DFDERSEAPRRPSRKGLRNDKDKPLPPLLARVGGNI3VLGFNAR 
QRKAFLNAIMRYGMPPQDAFTTQWLVRDLRGKSEKEFKAYVSLF 
MRHLCEPGADGAETFADG VPRBGLSRQHVLTR IGVMSLIR KKVQ 
EFEHVNGRWSMPEIiAEVEENKKMSQPGSPSPKTPTPSTPGDTQP 
NTP AP VP PAEDG I KI BENS L KE EES I EGEKEVKSTAPETA X ECT 
Q APAPASEDE KWVE P PEGEEKVEKAE VKERTEE PMETE PKGKG 
AADVBKVEEKSAIDLTPIWEDKEBKKEEEEKKBVMLQNGETPK 
DLNDEKQKKNIKQRFMFNIADGGFTELHSLWOJ^EERAATVTKKT 
YEI WHRRHDYWLLAG I INHG YARWQD IQNDPRYAI LN3 P FKGEM 
NRGNFLEIKNKFLARRFKLLEQALVIEEQLRRAAYLNMSEDPSH 
PSMALNTR F AEVE CLAESHQHLS KESMAGNKPANAVLHKVLKQL 
EELLSDM KAD VTPXPATI AR I PPVAVRLQMSERNILSRLANRAP 
EPTPQQVAQQQ 


6006 


1 


965 


DNDFLRNTVHRHBPPVTABPIRLLAENEDVWVDKPSSrPVHPC " 
GRFRHNTVT FILGK^HQLKELHPLHRLDRLTSGVIiMFAKTAAVS 
EJUHEQVRDRQLEKEYVCEVEGEFPTEEVTCKEPILVVSYKVGV 
CR VDPRGfCPCETVFQRLS YNGQS S WRCR PLTGRTHQ IRVHLQF 1 
LGHPI LND P I YNSVAWG PSRGRGGYTPKTNEELLRDL VAEHQAK 
QSLDVLDLCEGDLSPGLTDSTAPSSBLGKDDIiEELAAAA\QKMfi~ 

BVAEAAPQEIiDriALASKJCAVETDVMNQ\RQT\TLCRVPAGATG 

SIiAPRPCDVPTCPTL 


6007


3 


2351 


HELGQVEYVFTDKTGTLTENEMQFRECSINGMKYQEINGRLVPE 
GPTPDSSEGMLSYLSSLSHIiNNLSHLTTSSSFRTSPENBTEblK 
BHDLFFKAVSLCHTVQXNNVQTDCTGDGPWQSNLAPSQLEYYAS 
SPDEKALVBAAARIG1VFIGNSEETMEVKTLGKLERYKLL11ILE 
PDS DRR RMS VI VQAPSGEKLLFAKG AESS ILPKCIGGB IEKTR I 
HVDEFALKGLRTLCIAYRKFTSKEYEEIDKRrFEARTALQQR\E 
E KLAAV FQF I EJCDLI LLG ATAVEDRLQDKVRET I EALRMAG I KV 
W VLTGD KHBTAVS VS LS CGHFHRTMNILBL I NQ KS DS E CAEQLR 
QLARRITEDHVIQHGLVVDGTSLSLALREHEKLFMBVCRNCSAV 
LCCRMAPLQKAKVIRLIKISPEKPITLAVGDGANDVSMIQEAHV 
GIGIMGKBGRQAARNSDYAIARFKFLSKLLFVHGHFYYlRIATIi 
VQYFFYKNVCFI TPQFLYQFYCLFSQQTLYDSV YLTLY\NI CFT 
SLPILIYSLLEQHVDPttVLQNKPTLYRDISKNRLLSIKTFLYWT 
ILG FS HAFI FFFGS YLL I GKDTS IiLGNGCjMFGNWTFGTL VFT VM 
VITVTVKMALETHFWTWINHLVTWGSIIFYFVFSLPYGGILWPF 
LGSQNM YFVF I QLLSSGS AWFA I ILM WTCLFLD I IKKVFD RHL 
HPTSTBKAQLTETNAGIKCLDSMCCFPEGEAACASVGRMLERVI 
GRCSPTHrSRSWSASDPFYTNDRSILTLSTMDSSTC 


6008


4554 


1089 


A3VRRAGARRG PGRALP AGAT AVPP P SARRRRRCPAPEHAG PAR 
ASRPSQETMFQLPVNNLGSLRKARKTVXKILSDI GLE YCKEHI E 
DPKQ FE PNDFYLXNTTWEDVGLWDPS LTKNQD YRTKP FCCS AC P 
FSSKFFSAYKSHFRNVHSEDFENRILLNCPYCTFKADKKTLETH 
IKIFHAPNASAPSSSLSTFKDKNKNDGLKPKQADSVEQAVYYCK 
KCTYRDPLYE IVRKHI YREHFQHVAAPYI AKAGEKS LNGAVPLG 
SMAREESS I HCKRCLFMP KS YEALVQHVI EDHBR IG YQVTAMI G 
HTNVWPRS KPLML I APKPQDKKSMGLP PRIGS LAS GNV\RS L P 
S QQMVNRLS I PKPNLN STG VNMMS S VHLQQNN YGVKS VGQG Y S V 
GQSMRLGLGGNAPVSIPQQSQSVKQLLPSGNGRSYGLGSEQRSQ 
APARYSLQSANASSLSSGQLKSPSLSQSQASRVLGQSSSKPAAA 
ATG PPPGNTSSTQ KWKICT I CNELFPBNVYSVH FE KEHKAB KVP 
AYANYIMKn^TSKCLYOJRYLPTDTLLNHMLIHGLSCPYCRS 
TFNDVEKMAAHMRMVHIDEEMGPKTDSTLSFDLTLQQGSHTNIH 
LLVTTYNLRDAPAESVAYHAQNNPPVPPKPQPKVQEKADIPVKS 
SPQAAVPYKKDVGKTIiCPLCFSILKGPISDALAHHLRERHQVIQ 
TVH PVBKK1»TY1CC IHCLGVYTSNMTAST ITLHLVHCRGVGKTQN 
GQD KTNAPS R LNQS PSLAPVKRTYEQMEFP LLKKR KLDDDSDS P 
S FFEEKPEEP WLALDPKGH \ EDDS YEARKS FLTKYFT\ KQPYP 
TRRBIEKLAASLWV\WK\SDIASHFSNKRKKCVRDCEKYKPGVL 
LGFNMKELNKVKHEMDFDAEGLFENHDEKDSRVNASKTADfOCLN 
LGKEDDSSSDSFENLEE3SMESGSPFDPVFEVBPKISNDNPEBH 
VLKVI PEDAS ESEEKLDQKBDGSKYETIHLTEBPTKLMHNASDS 
EVDQDDWEWKDGASPSESGPGSQQVSDFEDNTCEMKPGTWSDB 
S S QSEDAR SSK PAAKKKATMQGDR2QLKW KNSS YGKVEG FWS KD 
QSQWXNASENDERLSNPQI EWQNSTIDS EDGEQFDNMTDGVAEP 
MHGSLAGVKLSSQQA 


6009 


4272 


1534 


CHGLQHLTPFRELN1jSLQG*EPH*AA*QAVRSEEKSIC*GSPSC " 
H LVLGVLVP VARQSSHS AG PAQ S AFR * TG TG S GTPKAAE QS G YW 

EAYTLGHQHWNMFPIQRPPLVMKGRRIMCGKCEKG*VSDSVTGG 
RAVAGEQASQRRTVFTAGGGECLGAKSVRASVFTGKQPGVMGLL 
NGKRGGCFESGYLFGF1VIGKIQSLBAKVPLPVNGQTGERASPG 
NCR IHI VDAVC* SEHH* DHFLAAAFLENSTI IS* VAPGSWQDHA 
VLQ KEV QAS VRCRGFBS VDTAPAGFWAHS P PGLQGEPTTTSVSL 
FVLAPQDGEGVPFVEGQLVTVLGLWPQSI RHTFVHHTQLFLHP 
I * KLGALD VAFLHLLTLVCS S FNVAYG *GKNGGTTLHQL FAEVN 
AVTRGSAVQRRPSITISSIHVDTKIQQELHDVMVAGADGWQWG 
DPFWGLAGIFHI*IDDPLHQIELSFQRRV*EQCQGVKPDSQPVP 
RPLRVGLLQVGPLVRGGGRRVAGRGKRCWRDLLFPWRWGLSHRT 
RDLLRGGDRGHWVIVLCRLGSLVGGLGTDELLWFGGR*^!! IG 
I**RGRLSGEWGCGLGRGELFQVSIGIGVSIVHIGQGDlteVLGG'~ 
AGLVERGALHATGQGVEALVQQLLDVGPAGALGLCDGAALPC2GP 
GRVGQL PAEGLQVCI TLVAQWRMHDGRELGGAEWPWQALHGAAI 
CGVGGAILLKALSQY FLKGG *RLWCARGQ* P VKKRQRRWRG *TR 
R *NCLTIHCFN * 1*1 * GAVCCRLVI LRWCGLIiEVHG VYGT * IHCL 
GSFPGRLWP* P PISQERPNGHCQWEFRLAVPSWKCRWSRWRVRG 
TWR YGNPLLNLL * GAWLGGAACGGQQGGPLSTWQACTGPGQAAF 
LPPFQGACRPRTQRCRTWCPIAWRQIiLAYTRD 


6010 


1 


3533 


IMPCGSSRIiIiRGCWTHPNEPVSDLSYFDCIESVMENSKVLGBSM """ 
AGISQNAKTGDLPAFGECVGIAS KAL CGLTEAAAQAAYL VGI FD 
PNSQAGHQGLVDPIQFARANQAIQMACQNLVDPGSSPSQVLSAA 
T IVAKH TS A ti CNACR IAS S KTANP VAXRH F VQS AKEVAN STANL 
VKTI KALDGDFSEDNRNKCRIATAPL IEAVENLTAFASNPEFVS 
I PAQ I SSEGSQAQB P I LVSAKPML E S SS YLIRTARSIA INPKDP 
PTWSVLAGHSH-rVSDSlKSLITSIRDKAPGQRBCDYSIDGINRC 
IRDI EQAS IAAVS QSLATRDDIS VEALQEQLTS WQEI GHLIDP 
IATAARGEAAQLGHKGTQLASYFBPLItiAAVGVASKILDHQQQM 
TVLDQTKTIJ\ESAL(MLYAAKEGGGNPKAQHTHDAITEAAQLMK 
E AVD D I MVT I >NF. A ASEVGL VGGMVDA I AEAMS KLDEGTPPE P KG 
TFVDYQTTWKySKAIAVTAQEMMTKSVTNPBELGGtiASQMTSD 
YGHLAFQGQMAAATAEP EE IG FQ I RTRVQDLGHGCI FLVQKAG\ 
ALQVCPTDS YTKRELI E CARAVTBKVSLVLS ALQAGNKGTQACI 
TAATAVSGIIADLDTTIMFATAGTLNAENSETFADHRENILKTA 
KALVEDTKLLVSGAAS T PD KLAQAAQ S SAATI TQLAEWKLGAA 
SLGSDDPETQWLINA1 KDVAKALSDL I SATKGAAS KPVDDPSM 
YQLKGAAXVIWTNVTSLLKTV'KA VED EATRGTRAL BATI ECIKQ 
ELTVFQS XDVPEKTSS P EES IRMTKG I TMATAKAVAASNS CRQE 
DVIATANLSRKAVSDMLTACKQASFHPDVSDEVRTRAXiRFGTEC 
TLG YLD L L EHVLV I K PTP E L KQQLAAFS KRVAGAVTEL I Q AA 
EAMKGTEWVDPBDP TVIAETELLGAAAS IEAAAKKLBQLKPRAK 
PKQADETLDFBEQILEAAKSIAAATSALVKSASAAQRELVAQGK 
VGSIPANAADDGQWSQGLISAARMVAAATSSLCEAANASVQGHA 
S EEKLISSAKQVAASTAQLLVACKVKADQDSEAMRRLQAAGNAV 
KRASDNLVRAAQKAAFGKADDDDVWKTKFVGG3AQIIAAQEEM 
LKKERELEEARKKLAQIRQQQYKFLPTELRBDBG 


6011 


446 


1835 


LLQP AMRKS PGLS D CL WAW I LI*LST LTGR S YGQP SLQDELKDNT 
TVFTR I LDRLLDG YDNRLR PGLGERVTE V KTDI FVTS PGPVS DH 
DMEYTIDVFFRQSWKDERLKFKGPMTVLRLNNLMASKIWTPDTF 
FHNOKKSVAHNMTMPNKLLRITSDGTLLYTMRLTVR\AECPMAF 
G RDFPM\ D\ AHACPLKFGS YAYTRAEVVY3 WTRE PARS VWAED 
GSRL^YDLLGQTVDSGIVQSSTGEYVVMTTHPHLKRKIGYFVI 
QTYLP CI MT V I LSQVS FWLNRES VPAR7 VFGVTTVLTMTTL SIS 
ARNSLPEVAYATAMDWFIAVCYAFVFSAL IEFATVNYFTKRGYA 
WDGKSWPEKPKKVKDPLIKKNNTYAPTATSYTPNIARGDPGLA 
TIAKSATIEPKBVKPETKPPBPKKTFNSVSKIDRLSRIAFPLLF 
GI FNLVYWATYLNRE PQLKAP TPHQ 


6012 


351 


5013 


PAELFQSFAIWHKELYDWRLGPWNQCQPVlSKSLEKPLECIKGE"" 
EGIQVREIACIQKDKDI PAEDI rCEYFEPKPLLEQACLI PCQQD 
CIVSEFSAWSBCSKTCGSGLQHRTRHWAPPQFGGSGCPNLTEF 
Q VCQS 5 PCEAEELR YS LHVG P WSTCSMPHS RQVRQARRRGKNKE 
REKDRSKGVKDPEARELIKKKRIJRNRQNRQENKYWDIQIGYQTR 
EVMCINKTGKAADLSFCQQEKLPMTFQSCVITKECQVSEWSEWS 
PCSOCHDMVS PAGTRVRTRTIRQFPIGS EKECPE FEEKEPCLS 
QGDGWPCATYGWRTTEWTBCRVDPLIiSQQDXRRGNQTAIjCGGG 
IQTRE VYCVQANEl^LSQLSTHKNKEASKPMDLKLCTGP I PNTT 
QLCH I PCPTECEVS PWS AWG PCT YBNCND QQGKKGFKLRJCRRI T 
NEPTGGSGVTGNCPHLI*EAIPCEBPACYDWKAVRLGDCEPDNGK 
BCGPGTQVQEWCINSDGBEVDRQLCRDAIFPI PVACDAPCPKD 
CVLSTWSTWSSCSHTCSGKTTEGKQIRARS ILAYAGEEGGIRCP 
NSSALQEVRS CNCTPCTVYHWQTGPWGQCIEDTSVSSFNTTTTW 
NGEAS CS VGM QTRKVI CVRVNVGQVGPKKCPES LRPBTVRPCLL | 
PCKKDCI VTP YSDWTSCPS \SCI05GDSS IRKQSRHRVI IQLPAN 
GGRDCTDP LYEEKACEAPQACQS YRW\ KTH KW \ HRCQ \ LVP\ WS 
VQQDS P\GAQEGCG PGRQARAI TCRKQDGGQAG IHECLQ YAGP V 
P ALTQACQ I PCQDDCQLTS WSKPS S CNGDCGAVRTRKRTLVGKS 
KKKE KCKNSHLYPL IETQYCPCDKYNAQP VGNWSDCILPEGKVE 
VLLGMKVQGD I KECGQG YRYQAMACYDQNGRLVETSRCNS HGY I 
EEACIIPCPSDCKLSEWS^SRCSKSCGSGVKVRSKWLREKPYN 
GGRPCPKLDHVNQAQVYEWPCHSDCNQYLWVTEPWSICKVTPV 
NMRENCGEGVQTRKVRCMQNTADGPSEHVEDYLCDPBEMPLGSR 
VCKLPCPEDCVISEWGPWTQCVLPCNQSSFRQRSADPIRQPADE 
GRSCPNAVEKEPCNLNKNCYHYDYNVTDWSTCQL5BKAVCGNGI ' 
tvi ai'Ujl^- v Ad u fto vubM t_h»A u\j hh rvN WQPxNTSCMVECP vNCQ 
LSDWS PWSECS QTCGL?GKMIRRRTVTQPFQGDGRPCPSLMDQS 
KPCP VKPCYR WQYGQ WS PCQ VQEAQCGEGTRTRNI S CWS DGS A 
DDFSKWBBEFCADIBLIIDGNKNMVLEESCSQPCPGDCYLKDW 

C C?TJC r.fTlT .TCXTViri T?DT "CfT* T ATfD O D 7T trwr c\i rvtrr ^mi/MIT 
j>onou^viii \*vri\je*UL>\jr \jljJ.UYK5»KrVj[ IQEIiENQHLCPEQML 

ETKSCYDGQCYEYKWMASAWKGSSRTVWCQRSDGINVTGGCLVM 
SQPDADRSCNPPCSQPHSYCSBTKTCHCBBGYTBVMSSNSTLEQ 
CTLI PVWLPTMEDKRGDVKTSRAVH PTQPSSNPAGRGRTWFLQ 
P FGPDG RLKTWVYGVAAG A PVLLI F I VSM I YLACKKP KKPQRRQ 
NNRLKPLTLAYDGDADM 


6013 


1161 


710 


GAF1AG VPVQPVLIRYPNS LDTTSWAWRGPG VLKVLWLTASQPC ~" 
S I VDVEFLPVYHPSPEESRDPTLYANNVQRVMAQALG1 PATECE 
F VGS LPVI WGRLKVALEPQL/WGTGKSAS EGWAVRKLCGRWGR 
ARPESNDQPGRVCQAATAIi 


6014 


2857 


613 


eavaggmeksrmnlpkgpdtj^fdkdefmkedfdvdhfvsdcrk 

RVQ LEELR DDLE LYYKLLKTAMVBL INKD YADF\ VNLS TNLVGM 
DKALNQLS VPLGQLREEVLSLRSS VSEGIRAVDBRMS KQEDI RK 
KKMCVLRLIQVTRS VEKI E KILNS QSS KETS ALEASS PLLTGQI 
LER IATBFNQLQFHACQS K \GMPLLDKVR PRI AG ITAMLQQSLE 
GLLLEG LQTSDVD 1 1 RHCLRTYATI DKTRDAEALVGQVLVKP Y I 
DEVIIEQFVESHPNGLQVMYNKLLEFVPHHCRLLREVTGGAISS 
EKGNTVPGYDFLVNSVWPQIVQGLEEKLPSLFNPGNPDAFHEKY 

RFREIAGSLEAALTDVIiEDAPAESPYCLLASHRTWSSLRRCWSD 
EMFLPLLVHRLWRLHSGRFWARYSVFV\N\BLSLRPZSNESPKE 
IKKPLVTCSKEPSITQGNTEDQGSGPSETKPWSISRTQliVYW 
ADLDKLQEQLPELLEIIKPKLEMIGFKNFSSISAALEDSQSSF3 
ACVPSLSSKI IQDLSDSCFGFLKSALEVPRLYRRTNKEVPTTAS 
SYVDSALKPLFQLQSGHKDKLKQA2 1 QQWLEGTLS ESTHKYYET 
VSDVLNS VKKMEESLKRliKQARKTTPANP VGP SGGMSODDKI RL 
QLALDVEYLGEQIQKLGLQASDIKSFSALAELVAAAKDQAYAKQ 
P 


6015


13 


2237 


AEGCAERRGTEP WELSMS WE SGAG PGLGSQGMDLVWSAWVGKC 
VKGKG SIjPLSAHG I WAWLS RAEWDQVTVYLF CDDHKLQRYALN 
RITVVJR3RSGNELPLAVASTADL1RCICLLDVTGGLGTDELRLLY 
GMALVRFVNLISBRKTKFAKVPLKCLAQEVNI PDWI VDLRHELT 
HKKMPHIOTCRRGCYFVLDWLQKTYWCRQLENSLRETWELBEFR 
EGI EEEOQEEDKNIWDD ITEQKP BPQDDGKS TE S DVKADGDS K 
GSEEVDSHCKKALSHKELYERARELLVS YEEEQFTVLEKFRYLP 
KAIKAWNNPSPRVECVLAELICGVTCT'MPPAVT nAFT.rnvnr tttvp 

FEQLAALQIEYEENVDLNDVLVPKPFSQFWQPLLRGLHSQNFTQ 
ALLERMLSELPALGISGlRprYILRWTVELIVANTKTGRNARRF 
SAGQWBARRGWRLFNCSASLDWPRMVESCLGSPCWASPQLLRII 
F\KAMGQGLQDE\EQEKLLRICSIYTQSGENSLVQEGSEASPIG 
KSPYTLDSLYWSVKPASSSFGSEAKAQQQEBQGSVNDVKBEEKE 
EKEVLPDQVEEEEEtfDDQEEEEEDEDDEDDEEEDRMEVGPFSTG 
QESPTAENARLLAQKRGALQGSAMQVSSEDVRWDTFP\LGRMPR 
SRPRTPAELMLENYDTHVI FWTKPVL\EQRLEPSTCK\TDTLGL 
\SCfiVGS\GNCSNSSSSNFRGAFriLEARGSLHNGL\KTGLQLF 


6016


13 


2237 


ftEGCAERRGTEPWELSMSWESGAGPGLGSQGMDLVWSAWYGKC 






WO 01/53312 



PCT/US00/34263 



VKGKGSLPJbSAHGIWAWLSRAEWDQVTVVLFCDDHKLQRYAIiN 
RITVWRSRSGNELPIJVVASTADLIRCKLLDVTGGLGTDELRLLy 
GMALVRPVNLISERKTKFAKVP3jKCLAQEVNIPDWIVDLRHELT 
HKXMPKINDCRRGCYFVLD WLQKT YWCRQLENS LRETWELEEFR 
EG I EBEDQEEDKNI WDDI TEQFCPE PQDDGKSTE SD VKADGDS K 
GSEEVDSHCKKALSHKELYBRARELLVSYEEEQFTVLEKFRYLP 
KAI KAKNWPS PRVE C VLAEL KGVTCENR EAVLDAFLDDGFLVPT 
FEQLAALQIEYEE2m5r.NDVLVPKPFSQFWQ?LLRGLHSQNFTQ 
ALLERMLSELPALGISGIRPTYILRWTVELIVANTKTGRNARRF 
S AGQ WEARRG WRLFNCS ASLDW P RMVES CLGS PCWAS PQLLR 1 1 
F\ KAMGQG LQDE \BQEKLLRI CS I YTQSGENSLVOEGSEAS P IG 
KSPYTLDSLYWSVKPASSSFGSEAKACQQEEQGSVNDVKBEEKE 
EKEVLPDQVEEEEBNDDQEEBEEDEDDEDDEEEDRMEVGPFSTG 
QESPTAENARLLAQKRGALQGSAWQVSSEDVRWDTFP\LGRMPR 
SRPRTPAELMLENYDTHVIFWTKPVL\EQRLEPSTCK\TDTLGL 
\SCGVGS\GNCSNSSSSNFRGAFLLEARGSLH\GL\KTGLQLF 


6017 


203 


3469 


SHQE I EQNS AMAPRKRGGRGIS F I FCCFRNNDHPE I TYRLRNDS ' 
NPALQTMEPALPMPPVEELDVMFSELVDELDLTDKHREAMFALP 
AEKKWQIYCSKKIODQEEX*KGATSWPEFYIE)0IiNSMAARKSLliAL 
EKEEEBERS KT IESLKTALRTKPMRFVTR F I DLDGLSCI LNFLK 
TMDYETS ES R I HTS IiIG CI KALMNNSCjGRAH VLAHSBS I NVI AQ 
S LS T EMI KT KVAVLE I LG AVCLVPGGHKKVLQAM LHYQ KYASE R 
TRFQTLINDLDXSTGR YRDE VS LKTA I MS F I NAVLSQGAGVESL 
DFRLHLRYE \ FLMLGIHP VMDKLRKHENSTLDRHLDFFEMLRNE 
DELEFAKRFELVHlDTKSATQKFBIiTRKRLTHSEAYPHFMSILH 
HCLQMPYKRSGNTVQYWLLLDRI 1QQ1VIQNDKGQDPDSTPLEN 
FNI KNWRM LVNENEVKQWKEQAE KMRKEHNELQQ KLEKKEREC 
DAKTQEJCEEMMQTliNKMKEKIiE KETTEHKQVKQQVAELTAQLHE 
LSRRAVCASIPGGPSPGAPGGPFPSSVPGSLLPPPPPPPLPGGM 
LPPPPPPLFPGGPPPPPGPPPLGAIMPPPGAPMGLALKKKSIPQ 
PTNALKSFNWSKLPENKLEGTVWTEIDDTKVFKirjDLEDLERTF 
SAYQRQQDFFVNSNSKOKEADAIDDTLSSKLJCVTfRT «3Vinrot>A 
QNCKILLSRLKLSNDE I KRAI LTMDEQEDLPKDMLEQLLKFVPE 
KSDIDLLEEHKBELDRMAKADRFLFEMSRINHYQQRLQSLYFKK 
KFAERVAEVKPKVEAIRSGSBEVFRSGALKOliLBVVIAFGNYMN 
KGQRGNAYGF K I SS LNK I ADTKSS I DKNI TLLKYIj ITI VENKYP 
S VLNLNE ELRD I PQAAKVNMTELDKE ISTLRSGLKAVETELE YQ 
KSQP PQPGDKP VS WSQ PI T VAS FS FSDVEDLLAEAKDLFTKAV 
KHFGEEAGKIQPDEFFGIFDQFLQAVSEAKQENENMRKKKEEBE 
RRARMEAQLKEQRERERKMRKAKENSEESGEFDDLVSALRSGEV 
FDKDLSKLKRNRKRITNQMTDSSRERPITKLNP 


6018 


13 
2 


2510 



TISQSGGIRRRREAVWFEVVNMDFSRIiHMYSPPQCVPENTGYTy 
ALS3 S YS SDALDFETEHKLDP VFDSPRMSRRSLRIATTACTLGD 
GEAVGADSGTSSAVSLKNRAARTTKQRRS TNKSAFS INHVSRQV 
TSSGVSYGGTVSLQDAVTRRP PVLDES W IREQTTVDHFWGLDDD 
GDLKGGNKAAIQGNGDVGAGAATGHNGFFCSNGNMLS2RFCDVLT 
AHPAAPG PVSRVYSRDRNQKCDD CKGKRHLDAHPGRAGTLWHI W 
ACAG YFLLQ ILRRIGAVGQAVS RTAWSALWLAVVAP GXAASGVF 
WWLGIGWYQF\rrLISWLNVFLLTRCLRNICKFLVLLI?LFLLLG 
LS LRGQG \NFFS FLP VLNWASMHRTQRVDD P QDVFKPTTSRLKQ 
PLQGDSEAFPWHWMSGVEQQVASLSGQOCHHGENLREIjTTLLQK 
LQARVDQMEGGAAGPSASVRDAVGQPPRETDFMAFHQEHEVRMS 
HLEDILGKLREKSEAIQKELEQTKQKTI s a vgeqllptvehlql 
ELDQLKSELSSWRHVKTGCETVDAVQERVDVQVREMVKLLFSED 
QQGGSLBQLLQRFSSQFVSKGDLQTMLRDLQLQILRNVTHHVSV 
TKQLPTS EAWSAVSEAGASGI TEAQARA I VNSALKL YSQDKTG 
MVDFALBSGGGSILSTRCSETYETKTALMSLFGIPLWYFSQSPR 
WIQ PDI YPGNCWAFKGSQGYLWRLSMK I HPAAFTLE H T PKTL 
SPTGNISSAPKDFAVYGriENEYQK15GQLT,GQFTYDQDGESLQMF 
OALKRPDDTAFQIVELRIFSNWGHPEYTCLYRFRVHGEPVK 
rPNDREPPPQkPPSSRRASHLAQEITSAASLGDQTQiLGSLTTA " 


6020 






PVITSAIRSMPGISSQILTNAQGQVIGTLPWVVNSASVAAPAPA 
QSIOVQAVTPQLLLNfAQGQVIATLAS S PLP PPVAVRK\ PSTPES 
LLKSBVQPlKPTPTVPQPAWIASPAPAAKPSASAPrPITCSET 
PTVSQLVSKPHTPSLDEDGINLEEIREFAKNFK1RRLSLGLTQT 
QVGQALTATEGPAYSQS AI CRFE KLDI T P KS AQKLKP VLEKWLN 
EAELRNQEGQQNLMEFVGGEPSKKRKRRTSPrPQAIBALNAYFE 

KNPLPTGQBITEIAKELNYDREWRVWFCNRRQTLKNTSKLNVF 
QIP 




4953 


549


EAIQFEVSIGNYGNKFDTTCKPIASTTQYSRAVFDGNYYYYLPW 
AHTKPVVTlJTSYWEDISHRLDAVNTLLAMAERLO/rNIEALKSGI 
QG KIPANQLAELWLKLI DEVI EDTRYTL P IiTEGKANVTVLDTQI 
RKLRSRSLSQIHEAAVRMRSEATDVKSTIAErEDWLDKLMQLTE 
EPQNSy.PD III WMIRGE KRLAYARI PAHQVLYSTSGENASGKYC 
GKTQTI FLKYPQE KNNGPKVPVELRVN I WLGLS AVE KKFNS FAE 
GTFTVFAEMYBNQALMFGKWGTSG LVGRHKFSDVTGKI KLKREF 
FLPPKGWEWEGBWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 
AVDBKGWEYGI TIPPDHKPKSWVAAEKMYHTHRRRRLVR KRKKD 
LTQTASSTAGAMEE LQDQEGWB YASLIGWK FHW KQRS SDT FRRR 
RWRRKMAPS BTHGAAAI FKLEGALGADTTEDGDEKSLEKQKHSA 
TT^GANTPIVSOCFDRDYIYHLRCYVYQARNLLALDKDSFSDP 
YAHI CFLHRSKTTK I IHSTLNPTWDQTI IFDBVEIYGEPQTVLQ 
NPPJCVIMELFDNDQVGKDBFLGRSIFSPVVKLNSEMDITPKLLW 
HPVMNGDKACGDVLVTAELILRGKDGSNLP ILPPQRAPNLYMVP 
QGIRPWQLTAIEILAWGLRNMKNFQMASITSPSLWBCGGBRV 
ESWIKWLKKTPNFPSSVIiFMKVFLPKEELYMPPLVIKVIDHRQ 
FGRKPWGQCTIBRLDRFRCT>PYAGKEDIVPQLKASLLSAPPCR 
D I V I EM BDTKPIiLASKCLSSMSTALS XMAS PATVHLTEKEEE IV 
DWWSKFYASSGEHEKCGQYIQKGYSKLKIYNCELENVAEFEGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFRELPDSVPQECTVRI YIVRGLELQPQDNNGLCDPYI KITLG 
KKVI E \ DRDH Y I PNTLNP VFGRMYBLSCYLPQEKDLKIS VYDYD 
TFTRDEKVGBTIIDI.^NPPV r.<;i? pn\ cwrrVTOppv^ir^i,™,, 

RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSR2RYGGRDYSLDB 
FEANKIIiHQHLGAPBERIiALHIIiRTQGIiVPEHVETRTtJISTFQP 
NIS\RYYLRVI IWNTKDVILDEKS ITGEEMSDI YVKGWIPGNEE 
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCI VAKKEHFW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI 
HFLQKS PGGNC/RGLDMI PDLKAMNPLKAICTASLFEQ KSMKGNW 
PCYAEIOXSARWIAGKVEMTLEILNEKEADERPAGKGRDEPNMNP 
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVIIGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPNV 


6021 


4953 


549 



EAIQFEVSIGWYGNiCFDTTCKPLASTTQYSRAVFDGNYYYYLPM 
AHTKPWTLTSYWEDI SHRLDAVNTLLAMAERLQTNIEALKSGI 
QGXIPANQLAELWLKLIDBVI EDTRYTLPLTEGKANVTVLDTQI 
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTB 
E PONS M PD 1 1 1 WM IRGEKRLAYARI PAHQ VLYSTS GENAS GKYC 

GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKXFNSFAE 
GTFTVFAEMYENQALMFGKWG-TSGLVGRHKFSDVTGKI KLKREF 
FLPPKGWEWEGEW3VDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWK PAE DTYTDANGDKAAS PS ELTCPPGWE WEDDAWS YDI NR 
AVDEKGVTEYGITIPPDHKPKSWVAABKMYHTHRRRRLVRKRKiCD 
LTQTASSTAGAKEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
RWRRKMAPSETHGAAAI FKLEGALGADTTEDGDEKSLEKQKHSA 
TTVFGANT P TVS CN FDRD YI YHLR CYVYQARNLLALDKDS FSDP 
YAHICFLHRSKTTB I IHSTLNPTWDQTI I FDEVEI YGEPQTVLQ 
VP PKVTME LFDNDO/VGKDEFLGRS I FSP WKLNSEMD I TP XLLW 
iPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPHLYMVP 
2G IRPWQLTAIEILAWGLRNMKNFQMAS ITSPSLWECGGERV 
2SWIKNLKKTPNFPSSVLFKXVFLPKEBLYMPPLVIKVIDHRQ 
^RKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 
DIVIEMBDTKPLLASKCbSi5MSTAI*SKMASPATVHLTEKBBEIV 
DWWSKFYASSGBHE KCGQ Y IQKG YSKLKI YNCELBNVAE7EGLT 
DFSDTFKL YRGKSDENEDPS WG E FKGSFRI YPLPDDPS VPAPP 
ROFRELPDSVPQBCTVRIYIVRGLELQPQDNNGLCDPYIKITLG 
KKVIE\DRDHYIPNTLNPVFGRMYELSCYLPQEKDLKISVYDYD 
TFTRDEKVGETT IDLENPF\l*SRFG\ SHCG\ IPBEYCVSGVNTW 
RDSLR\PTQ\LLQNVARFKGPPQPILSEDGSRIRYGGRDYSLDE 
PEANKILHQHIiGAPEERLALHILRTQGLVPEHVETRTLHSTFQP 
NIS\RYYLRVTIWKTKDVILDBKSITGEEMSDIYVKGWIPGNBB 
NKQKTDVHYRSLDGBGNFNWRPVFPFDYLPAEQLCIVAKKEHFW 
SIDQTEFR1PPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI 
HFLQKSPGGN C /RG LDMI PDLKAWN PLKAKTAS LFEQKSMKGWW 
PCYAEKDGAR VMAG KVEMTLEILNE KEADERPAG KGRDE PNMNP 
KIiDLPNRPETSFLWPTNPCKTMKFIVWRRFKWVIIGLLFLLILL 
LPVAVLLYSLPNYLSMKIVKPtfV 


6022 


4953 


549 


EAIQFEVSl^NYGMKFDTTCKPIiASTtqYSRAVFDGNYYYYEtPW"" 
AHTKPVVTLTSYWEDISHRLDAVNTLIiAMAERLQTNIEALKSGI 
CX3KIPANQLAELI^KL1DBVIEDTRYTLPLTEGKANVTVLDTQI 
RXLRSRSLSQIHEAAVRMRSEATDVKSTLAETEDWLDKliMQLTE 
EPQNSMPDI I IWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC 
GKTQTI FLK YPQE KNNGP KVP VELRVN I WLGLSAVEKKFNS FAE 
GTFTVFAEM YENQA1MFGKWGT5GLVGRHKPSDVTGKI KLKREF 
FLPPKGWEWEGEWrVDPERSLLTBADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWS YD INR 
AVDEKGKEYGI T I PPDHKPKS WVAAEKMYHTHRRRRLVR KRKKD 
LTQTAS STAGAMEELQDQEG W EYAS L IGWKFHWKQRSS DTFRRR 
RWRRKMAPSETHGAAAI FKLEGALGADTTBDGDEKSLEKQKHSA 
TTVFGAmTlVSOTFDRDYIYHLRCYVYQARNLLALDKDSFSDP 
YAHICFLHRSKTTEIIKSTLITPTWDQTIIFDEVEIYGEPQTVLQ 
NPPKV IME LFDNDQ VGKDE FXiGRS I FS PWKLNSEMD I T PKLLW 
H PVMNGDKACGDVL VTAEL I LRGKDGSNLPILP PQRAPNLYM VP 
QGI R P WQLTAI EI LAWGIiRNMKNFQMAS ITSPSLWECGGERV 
ESWI1TOLKKTPNFPSSVLFMXVFLPKEELYMPPLVIKVIDHRQ 
FGR KPWGQCT I ERLDRFRCDP YAGKED I VPQLKAS LLS APPCR 
DIVIEMEDTKPLLASKCLSSMSrALSKMASPATVHLT3KEBEIV 
DWWSKFYASSGBHE XCGQY IQKGYSKLKI YNCELENVAEFEGLT 
DFSDTFKLYRGKSDZNEDPSWGEFKGSFRI YPLPDDPS VPAPP 
RQFREL PDS VPQBCTVR I Y I VRGLELQPODNNGLCD P YI KI TLG 
KKVIE\DRDHYIPNTLNPVFGRWYELSCYLPQEKDLKISVYDYD 
TFTRDEKVGETI IDLENPF\LSRFG\SHCG\IPEEYCVSGVNTW 
RDSLR\PTQ\LLQNVARFKGFPQP1LSEDGSR1RYGGRDYSLDB 
FEANKILHQHLGAPEERLALHILRTQGLVPEHVETRTLHSTFQP 
Nrs\RYYLRVIIWNTKDVILDEKSITGEEMSDIYVKGWiPGNEE 
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 
S IDQTEFRI PPR\LI IQIW\DNDKFS\LDDYLGFPRTLTCRHTI 
HFLQKSPGGNC/RGLDMIPDLKAMNPLKAKTASLFEQKSMKGWW 
ik&juajAKVkAG KVEMTLEILNE KEADERPAGKGRDEPNMNP 
KLDLPNRPETS FLW FTNPCKTMKFI VWR R FKWVI I GLL FLL I L L 
LFVAVLLYS LPNYLSMKI VKPNV 


6023 


102 


816


SQELGMF VELNNLLNTTPDRAEQGKLTLLCDAKTDGS FL VHHFL 
SFYLKANCKVCFVALIQS FSHYS I VGQKLGVSLTMARERGQLVF 
LEGL/XVCSGR\VFQAQKEPHPLQFLREANAGNLKPLFEFVREA 
LKPVDSGEARWTYPVLbVDDLSVLLSLGMGAVAVLDFIHYCRAT 
VCWELKGNMVVTjVHDSGDAEDEENDILLNGLSHQSHLILRAEGL 
ATGFCRDVHGQLR I LWRRPSQPAVHRDQS FT YQYKIQDKS VS FF 
AKGMSPAVL 


6024 


3 


3260 


flsflcyprfrclfclqfaieasrMeqlnelellmeksfweeab " 
lpaelfqkkwasfprtvi^tgmdnrylvlavntvqnkegncek 
rlvitasqslenxelcilrndwcsvpvepgdi ihlegdctsdtw 

1 1 DKDFGYL I L YPDML I SGTS IAS S IRCMRRAVLSETFRSSDPA 
rRQMLIGTVLHEVFXJKAXNNSFAPEKLQELAFGTIQBIRHXKBM 
YRLNLSQDEIKQEVEDYLPSFCKWAGDFMHKNTSTDFPQMQLSL 
PSDN S KDNS TCNIE WXPMD I E ES I WS PR FGLKGKI DVTVGVK I 
HRGYKTKYKIMPLELKTGKESNS IEHRSQWuYTLLSQERRADP 
EAGLLLYLKTGQMYPVPANHLDKRELLKLRNQMAFSLFHRISKS 
ATRQKTQLASLPQI IEEEKTCKYCSQIGNCALYSRAVEQQMDCS 
S VP I VMI.P K I EEETQHI»KQTHLE YFSLW CLMLTLB S QSKDN KKN 
HQNIWLMPASEMEKSGSCIGNL IRMEHVKI VCDGQYLHNFQCKH 
GAI PVTNLMAGDRVI VSGEERSLFALSRGYVKEINMTTVTCLLD 
RNLSVLPESTLFRLDQEEKNCDIDTPLGNLSKLMENTFVSKKLR 
DLI I DFREPQ F I SYLSS VLPHDAKDT VACI LKGLNKPQRQAMKK 
VLLS KDYTLI VGMPGTG KTTT I CTLVRILYACGFS VLLTS YTHS 
AVDNILLKLAKFKIG FLRSR\Q IQKVHPA1QQFTEHEI CRSKSI 
KS\LALLEELYTSQLIDATTCMGINHPIPSRKIFDFCIVDEASQ 
ISQPICLGPLFFSRRFVLVGDHQQLPPLVLNREARALGMSESLF 
KRLEQNKSAWQI,TVQYRMNSKIMSLSNKI*TYEGKLECGSDKVA 
NAVINL RHFKDVKLBLEFYADYS DNPWLMGVFEPNNPVCFLNTD 
KVPAPEQVEKGGVSN VTEAKLI VFLTSI FVKAGCSPSDIGI IAP 
YRQQLKI INDLLARS IGMVEVNTVDKYQD\RDKS I VLVSFVRSN 
KDGTVGELL KDWRR T »NVAI TRAKHKL I LLGCVPS LNCYPPLEKL 
LNHLNSEKLI IDLPSREHESLCHILGDFQRE 




6025 


3977




GGFPAQSDHLPPVFPLRSDLLITMSTLYVSPHPDAFPSIiRALIA 
ARYGEAGEGPGWGGAHPRICLQPPPTSRTSFPPPRLPALEQGPG 
GLWVWGATAVAQLLWPAGLGGPGGSRAAVLVQQWVSYADTELIP 
AACGATLPALGLRSSAQDP QAVLGALGRALS PLEE WLRLHT YLA 
GBAPTLADLAAVTALLL P FRY VLDPPARRI WMNVTRWFVTCVRQ 
PE FRAVLGEWIi VSGAR P LS HQPGPBA PALPKTAAQLXKEAKKR 
EKLBKFQQKQKIQQQQPPPGEKKPKPEXREKRDPGVITYDIiPTP 
PGEKKDVSGPMPDS YS P RYVE AAW YP W WEQQGFFKPEYGRPNVS 
AANP RGVFMMC I P PPNVTGS LHLGHALTNAI QDS LTRWHRMRGE 
TrLWNPGCDHAGlATQVWEKKLWREQGLSRHQLGREAFLQEVW 
KWKEEKGDRIYHQLKKLGSSLDWDRACFTMDPKLSAAVTEAFVR 
LHEEGIIYR5TRLVNWSCTLNSAISDIEVDKKELTGRTLLSVPG 
YKEKVE FGVLVSFAYKVQGSDSDEEVWATTRI ETMLGDVAVAV 
H PKDTR YQHL KGKNVXH P FLSRS LP IVFDEFVDMDFGTGAVKIT 
PAHDQND YEVGQRfTGLEAI S IMDSRGAL INVPP PFLGLPRFEAR 
KAVLVAL KERGX.FRG I EDNPMWPLCNRS KDVVEPLLRPQW YVR 
CGEMAQAASAAVTRGDLRI Ij PERHQRTWHAVJMDNI RE \ WCMFPG 
KLWWG \ HR \ I PAYFVTVSDPAVP PGEDPDGRYWVSGRNEAB ARE 
KAAKEFG VS PDKISLQQDEDVLDTWFS SGLFPLS I LGWPNQS ED 
LS VFYPOTLLETGHDI LFFWVARMVMLG LKLTGRLP FREVYLHA 

TVPn&Wm? VMClf QTJ^*KTtyTT\T^T M T m (VIT rninf T MAMr «.«.» 

* jvivt'j jxvo jLAjii V l.LftrLtU V J. j.\3± a JjLfta l iMrJQTiT »N-"Njjr'PiJ 

EVEKAKBGQKADFPAG1PECGTDALRFGLCAYMSQGRDINLDVN 
R I LGYRHFCNKLWNATKFALRGLGKGFVPS PTSQPGGHESLVDR 
WIRSRIi TEAVRLSNQG FQA YD FPAVTTAQ YSFWL YELCD VYLEC 
LKPVLNGVDQVAAECARQTLYTCLDVGIjRLLSPFMPFVTEELFQ 

RLPRRMPQAPPSLCVTPYPBPSECSWKDPEAEAALEIALSITRA
VRP\LRADYNLHPESGPTCFLEVAD\EATGALASAVSGYVQGPG
Q AQ VWAVAE P WGLPAP \ QG CAVALASDRCSI \HLQLQG \LLDP

ARE LG\KLQ \ AKRVEAQ\ RQA0\ RLR\ERRA\ ASGNP VKVP ti \ E 
VQEADEAKLQQTEAELRKVDEAIALFQKML 


6026 


2674 


514


GPlTb'UCKt^KMtoMPLRIHVLLGLAITTLVQAVDKKVDCPRLC " 
TCEIRPWFTPRSIYMEASTVDCNDLGLLTFPARLPANTQILDLQ 
TNNIAKJEYSTDFPVNLTGLDLSGNNL5SVTNrWGKKMPQLLSV 
YLEENKLTELPEKCLSELSNLQELYINHNLLSTISPGAFIGLHT? 
LLRLHLNSNRLQMINSKWFDALPNLEILMIGENPI IRIKDMNFK 
PLINLRSLVIAGINLTE I PDNALVGLEMLES I S FYDNRL I KVPH 
VTUjQKVVNLKFLDLN'KNPINRIRRGDFSNMLHLKELGINNMPEL 
I S IDSLAVDNLPDLRKI EATNNPRLS YIHPNAPFRLPKLESLKL 
NSNALS ALYHGTIESL PNLKE I S I HSNP IRCDCVIRWMNMNKTN 
IRFMEPDSLFCVDPPEFQGQNVRQVHFRDMMEICLPLIAPESFP 

snlnveagsyvsfhcratanepqpbiywitpsgqkllpntXltd 
6027


5254


4148


6029


3432

KFYVHSEGTLDINGVTPKEGGLYTdlATNLVGADLKSVMIKVDG" 
SFPQDNNGSLNIKI RDIQANS VLVS WKAS SKILKSSVKWTA FVK 
TENSHAAQSAR IPS DVKVYNLTHLNP STB YKI CI D I PTI YQKNR 
XKCVNVTTKGLHPDQKEYBKNNTTTLMACLGGLLGI IGVI CLI S 

CLSPEMNCDGGHSYVRNYLQKPTFALGELYPPLINLMBAGKEKS 
TSLKVKATVIGLP TNMS 

GGRRAPGRPGRSIKDEEBETVFRBWS FSPDPLPVRYYDKDTTK 

PISFYLSSLEEIaLAWKPRLEDGFNVALEPXACRQPPLSSORPRT 

LLCHDMMGGYLDDRFIQGSVVQTPYAFYHWQCIDVFVYFSHHTV 

TIPPVGWTNTAHRHGVCVLGTFITEWNEGGRLCEAFLAGDERSY 

QAVADRIiVQlT\RFFRFDGWLINtENSLSLAAVGNMPPFLRYLT 

TQLHRQVPGGLVLWYDSWQSGQLKWQDEUIQHNRVFFDSCDGF 

FTWYNVJREEHLERMLGQAGERRADVYVGVDVPARGNWGGRFDT 

DKVGGGFRPRASGPVPPLGPHFLMDLPFPSAPQRNDSSCSSQSG 
DPVALRNRCPAPAKLCPH 



"3533- 



NCLLLQAKGFHGEIEDLQQWLTDTERHLLASKPLGGLPETAKEQ 
LrrVHMEVCAAFEAKEETYKSLMQKGQQMLARCPKSAETNIDQDI 
WNLKEKWESVETKLNER\KT\KLEEALNIA\MEFKNSL\QDFIN 
WLTQAEQTLNVASRPS LILDTVLFQIDEHKVFANEVNSHREQI I 
ELDXTGTHLKYFSQKQDVVLIXKLLISVQSRWEKVVQRLVERGR 
SLDDARKRAKQFHEAWSKLMEWLEESEKSLDSELEIANDPDiQK 
TQIJVQHKEFQKSLGAKHSVYDTTKRTGRSLKEKTSLADDNLKLD 
DMLS3LRDKWDTI CX3KS VERQNKLEEA\ LLFSGQFTDALQAL I D 
WLYRVEPQLAEDQPVHGDIDLVMNLIDNHKAFQKELGKRTSSVQ 
ALKRSARBLI EGSRDDS S WVKVQMQELS TRWET VCALS ISKQTR 
LEAAI»RO^EFHSVVHALI^WIAEAEQTLRFHGVLPDDEDALRT 
LIDQHKEFMKKLEEKRAELNKATTMGDTVLAICHPDS 1TTI KHW 
J TZ IRARFEEVLAWAKQHQQRLASALAGLIAKQELLEALIiAWLQ 
KAETTIiTDKDKEVI PQEIEEVKALIAEHQTFMEEMTRKQPDVDK 
VTKTYKRRAADPSSLQSHIPVLDKGRAGRKRFPASSLYPSGSQT 
QIETKNPRVNLLVSKWQQVWLLALERRRKLNDALDRLEELREFA 
NFDFDIWRKKYMRWMNHKKSRVMDFFRRIDKDQDGKITRQEFID 
GILSSKFPTSRLEMSAVADIFDRDGDGYIDYYEFVAALHPNKDA 
YKP ITDADKIEDE VTRQVAKCKC AKRFQVEQ IGDNKYRPFLGNQ 
FGDSQQLRL VR I LRST VM VRVGGGWMALDE FI>VKNDPCRAKGRT 
NKELREKFHADGASQGMAAFRPRGRRSRPSSRGASPNRSTSVS 
SQAAQAASPQVPATTTPK1 LHPLTRXJ YGKPWLTNSKMS TPCKAA 
ECSDFPVPSABGTPIQGSKLRLPGYLSGKGFHSGEDSGLITTAA 
ARVRTQFADSKKTPS R PGSRAGS KAGSRASSRRGSDASDFD I S E 

IQSVCSDVETVPQTHRPTPRAGSRPSTAKPSKIPTPQRKSPASK 
LDKSSXR 



IMPCGSSRDL^GCWTHPNEP V^DLSYFDCIESVMKySKViiGESM 
AGI S QNAKTGDLPAPGECVG IASKALCGLTEAAAQAAYLVG I FT> 
PNSQAGHQGLVDP IQFARANQAIQMACQNLVDPGS SPSQVLSAA 
T I VAXHTS ALCNACRIAS S KTANPVAKRHFVQS AKE VANSTANL 
VKTI KALDGDFSE DNRNKCR IATAPL1 EAVENLTAFASNPEF VS 
I PAQ I SS EGSQAQE PILVS AKP MI»E S SS YLIRTARSLAJNPKDP 
PTWSVLAGHSHTVSDSIKSLITSIRDKAPGQRECDYSIDGINRC 
I RD IEQASLAAVSQS EiATRDD I S VEALQEQLTS WQE IGHL I D P 
IATAARGEAAQLGHKGTQLAS YFEPLILAAVGVASKI LDHQQQM 
TVLDQTKTLABSALQMLYAAKEGGGNPKAQHTHDAITEAAQLMK 
E AVDD I MVTLMEAAS EVGLVGGKVDAI AEAM3KLDEGTPPEPKG 
TPVDYQTTVVKYSKArAVTAQEMMTKSVTNPEELGGLASQMTSD 
YGHIAFQGQMAAATAE PEEI GFQ IRTRVQDLGHGC3 FLVQKAG\ 
ALQVCPTDS YTKREL I ECARAVTEKVS L VLSALQ AGNKGTQACI 
TAATAVSGI I ADLDTT 1 MFATAGTLNAENS ET FADH RENT LKTA 
K^VEDTKLLVSGMSTPDKI^QAAQ^AATITO 
SLGSDDPETQWLINAI KDVAKALSDLISATKGAASKPVDDPSM 
YQLKGAAKVMVTNVTSLIjKT VKAVEDEATRGTRALEATI ECI KQ 
ELTVFQS KDVPEKTSS PEES I RMTKGITMATAKAVAAGHSCRQB 
DVIAiaNLSRKRVSI^lLTACKOASFHPDVSDEVRTRAIiRFGTBC 








TLGYLDLLEHVLV I LQKPTPELKQQLAAFS KRVAGAVTELIQAA 
EAMKGTBWVDPEDPTVIABTBLLGAAAS IEAAAKKLEQLKPRAK 
PKQADETLDFBBQILEAAKSIAAATSALVKSASAAQjRELVAQGK 
VGS I PANAADDGCJWSQG LI SAARMVAAATS S 1»CE AANAS VQGHA 
S S E KL3 S S AKQ VAAS TAQLLVACKVKADQDS EAMRRLQAAGNAV 
KRASDNLVRAAQKAA FG KADDDDVWKTKFVGG I AQI I AAQEEM 
I»KKZR£LEEARKICLAQIRQQQYKFLPTELREDEG 


6030 


3 


1777 


FPGRGSPALQLBVLICLGLMGLERALNVLAPIFYRNIVNLLTEN 
APNNS LAWTVTSYVFLKFEiQGGGTGSTGFVSNIjRTFLW I RVQQF 
TSRRVELLIFSHLHEIiSLRWHLGRRTGEVLRIADRGTSSVTGLL 
S YLVFNVI PTLADI IIGII YFSMFFNAWFGLI VFLCMSLYLTXT 

1wtewrtkprramntqenatraravdsllnfetvkyynaesyb 
veryreai ikyqglewkssaslvllnqtqnlviglgllagsllc 
ayfvteqklqvgdyvlfgtyiiqlymplnmfgtyyrmiqtnfid 
menmfdllkk\etbvkdlpgagpfrfqkgriepenvhpsyadgr 
btlqdvsftvmpgqtlalvgpsgagkstilrllfrfydissgci 
ridgqdisqvtqalfrfshwelcpkdtvlfndtiadnirygrvt 
agndsveaaaqaagihdaimafpegyrtqvgerglklsggeKqr 

VAI ARTILKAPGI ILLDEATSALDTSNERAI QASLAKVCANRTT 

X WAHRLSTWNADQILVI kdgci vergrheallsrggvyadmw 

QLQQGQEE'rSEDTKPQTMER 


6031 


160 


1694 


lrmsenldksnvneagksksndsbegledavegadbalqkaiks 
dsss pqrvqrphsspprfvtveelletargvtnmalaheivvng 
dfqikpvelpenslkkrvkeivhkafwdcl9vqlsbdppaydha 
iklvgeiketllsfllpghtrlrnqitevldldlikqeaengal 
dis klaefi igmmgtlcapardebvfckrikdi kei vplfrer fsv 
ldlmkvdmanfaiss irphlmqqsveyerkkfqei lerqpnsld 
fvtqwleeasedlmtqkykhalpvggmaagsgdmprlspvavqn 

YAYLKLLKWDHLQRPFPETVLMDQSRFHELQLQ\REQLT1LGAV 
LLVTFSMAAPGISSQADFAEFCLKMIVKrLLTDMHLPSFHLKDVL 
TTIGEJCVCLEVSSCLSLCGS S PFTTDKETVLKGQIQAYASPDDP 
IRR IMES R ILTFLET YLASGHQKP&P TVPGGLSP VQRELBEVAI 
KFARLVNYNKNVFCPYYDAILSKILVRS 


6032 


39 


2415 


AARLCRAQPTKSAWMIRDLSKM YPQTRHPAPHQPAQPFKFT IS E 
S CDRI KEEFQFLQAQYHS LK I .ECEKLASE KTEMQRHYVM YYEMS 
YGTiMI EMHKQAEI VKRLNAI CAQVI P FLS QEHQQQ WQAVERAK 
QVTMAELNAIIGQQQLQAQHLSHGHGLPVPLTPHPSGIiQPPAIP 
PIGSS AGLLALSSALGGQSHLPI KDBKKHHDNDHQRDRDS I KSS 
SVSPSASFRGAEKHRNSADYSSESKKQKTBEKBIAARYDSDGEK 
SDDNLWDVSNED PSSPRGS PAHSPRENG LDKTRLLKKDAP ISP 
ASIASSSSTPSSKSKELSLNEKSTTPVS KSNTPTPRTDAPTPGS 
NSTPGLR PVPGKP PG VDPLAS S LRTPMAVPCPY PTP FGI VPHAG 
riLVLaitij To P(jAA i AGJjHNI S PQMS AAAAAAAAAAA YGRS PWG FD 
PHHHMRVPAI PPNLTG1 PGGKPAYSFHVSADGQMQPVPFPPDAL 
IGPG I PRHARQINTLKfHGEWCAVTISNPTRHVYTGGXGCVKVW 
DISHPGNKS PVS QLDCLNRDNY I RSCRLLPDGRTLI VGGEAS TL 
S IWDLAAPTPRI KAELTSSAPACYALAI 3 PDSKVCFSCCSDGNI 
AVWDLHNQTLVRQFQGHTDGASC I DI SNDGTKLWTGGIJ3NTVRS 
W\DLREGRQITOHD/FFTSPVFSLGYCP\TEBWLAVGMENSN\V 
EVLHVTKPDK^QLHLHESCVLSLKFAHCGKWF\VSTGKDNLLNA 
W\RTPYG\ASIP\QSKBSSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6033 


39 


2415 


AARLCRAQPTKSAWMIRDLSKMYPQTRHPAPHQPAQPFKFTISE ' 
SCDRIKEEFQFLQAOVHSLKLECEKLASEKTEMQRHYVMYYEMS 
YGLNIEMHKQAE I VKRLNAI CAQVIPFLSQEHQQGWQAVERAK 
QVTMAELNAI IGQQQLQAQHLSHGHGLP VPLTPHPSGLQPPA1 P 
PI GSSAGLLALSSALGGQSHL PI KDEKKHHDNDHQRDRDS IKS S 
S VS PSAS FRGAEKHRNSADYS SES KKQKTEEK3I AARYDSDGB K 
SDDNLWDVSNEDPSSPRGSPAHSPRENGLDKTRLLKKDAPISP 
ASIASSSSTPSSKSKELSLNEKSTTPVSKSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPGVDPLASSLRTPMAVPCPYPTPFGIVPHAG 








MNGELTS PGAA YAGLHNIS P OMSAAAAAAAA a a a vr pqpmy^pS" 

PHHIJMRVPAI PPNLTGIPGGKPAYSFHVSAD3QMQPVPFPPDAI* 

IGPGIPRHARQINTLNHGEWCAVTISNPTRHVYTGGKGCVKVW 

DISHPGNKSPVSQLDCLNRDNYIRSCRIiLPDGRTLIVGGEASTL 

SIWDLAAPTPRI3CAELTSSAPACYALAISPDSKVCPSCCSDGNI 

AVWDLHNQTLVRQFQGiTTDGASCIDlSNDGTKLWTGGLDNTVRS 

MXDLRBGRQLQQHD/FFTSPVFSLGYCPXTBBWLAVGMENSNXV 

EVLHVTKPDKYQLHLHESCVLSLKFAHCGKNFNVSTGKDNLIiNA 

W\RTPYG\ASIF\QSK3SSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6034 


2583 


714 


ESGRRRRbKRRRSPCPGTAGGPGETNPGPGACPRGPREEAAAAM"" 
EIAPQEAPPVPGADGDI3EAPAEAGSPSPASPPADGRLKAAAKR 
VTFPSDED I VSGAVEP KDPWRHAQNVTVDE VI GAYKQACQ KLNC 
RQI PKLLRQLQEFTDLGHRLDCLD LKG B KLD Y KTCEALEEVFKR 
LQFKWDJjEQTNLDEDGASALFDM IE Y YESATHLNI S FNKH IGT 

RGWnAAAffMMPIfTCPTTWT \ r\ t\ mtn<riT r Mint - _ _ _ _ ___ 

«v «wrt/v\iinn*ttt. i tyi^Lfijxu \DAKNTPLLDHSAPFVARALRIRSS 
IAVLHLENASLSGRPLMLLATALK>INKNLRELYL\ADNKLNGIiQ 
DSAQLGNLLKFNCSLQILDLRNNHVLDSGIAYICBGLKEQRKGL 
v a w \ v jjrmi^v-ui n i^i'^r iij^nl Uoijc^TijITLCHNPIGNEGV 
RHLKNGLISNRSVLRLGLASTKLTCBGAVAVAEFIAESPRLLRL 
DLRENEI KTGGLMALS LALKVNHSLLRLDLDREPKKEAVKSFI E 
TQKALIJ^ IQNGCKRNLVLAREREBKEQPPQLSASMPETTATEP 
QP DDEPAAGVQNGAPS PAP S PDS D S DSDS DGEEBEBEEGERDET 
PSGAIDTRDTGSSEPQPPPEPPRSGPPLPNGLKPEFALALPPEP , 
PPGPEVKGGSCGIiEHELSCSKNEKELEELLLEASQESGQETL 


6035 


15 


404 


S VTYLGI ILHKNTGALPADPVQLI SQTPT PSTKQQLLSFLGMVG' 

yfylwipgfailtkplckltkeniiadaidp ks fshssfrslkta 
lenastlalpdssqpf\slhtaevqgcvveiltqglgplpv 1 


6036 


1745 


356 


L PD VEKIX3RRRGRKMDS VE KGAATS VSN PRGRPS RGRP PKLQRN 

fiRGGCV^RRVPTTDDUT jvrt tt t\ nrr* c t tit wirtrrrr | 

^^~VvmijVJ5AJ^HJytA*UjXJj^OGSKGIP 1 

VLPJVALDSGAFQSVWVSTDHDEIEOTAKQFGAQVHRRSSEVSKD 
SSTSLDAIIEFLNYHNEVDIVGNIQATSPCLHPTDLQKVAEMIR 
EEGYDSVFSWRRHQFRWSEI QKGVREVTEPLNLNPAKRPRRQD 
WDGEL YENGS FYFAKRHL I EMGY LQGG KMAYYEMRAEHS VD IDV 
DIDWPIAEQRVLRYGYFGKEKLKEIICLLVCNIDGCLTNGHIYVS 
GDQKEI IS YD VKDAIGI SLIiKKSGI E VRLI SERACS KQTLS SLK 
IJDCKMEVSVSDKLAVVDEWRKEMGLCWKEVAYLGNEVSDEECIiK 
RVGLSGAPADACSTAO"KAVGYTPKr*wnnpritt\ td9s>m?utM r t 

MEKGLINFMPKNRNLAVNIGEKK j 


6037 


2936 


1919 


WTSV7WMSSVLTI1*LFSLQGNKMLNYSAPSAGGYLLPRKPVGTPaH 
GGGFPRRHSVTLPSSKFRQNQIiLSSLKGEPAPALSSRDSRFRDR 

sfseggerllptqkqpgggqvnssrykt\elcrpfeengackyg 
dkcqfahgihelrsltrhpkyktelcrtfhtigfcpygprchpi 
hnaeerralagardlsadrprlqhsfsfagfpsaaataaatgll 1 
DS PTS itpppilsaddllgsptlpdgtnnpf\afssqelas l?a 
psmglpgggspttflfrpmses phmfds ppspqdslsdqeg yls 

SSSSSHSGSDSPTLDNSRRLPIFSRIiSISDD | 


6038 


1450 


426 


SSALQEFGTRIWTFGVPLPHRRKQIISCNICQLRFN^DSGAAAlH 
YKGTKHAKKLKALE^KNKQKSVTAKDSAKTTFTSITTNTINTS 

sdktj3gtagtpai sttttvei rkss vmtteits kvekspttatx3 
nsscpstbteebkakrll\ycslckvavnsasqleaidisgtkhk 
tmlearngsgtikafpragvkgkgpvnkgntglqnktfhceicd 

VHVNSETQIiKQHISSRRHKDRAAGKPPKPKYSPYNKLQKTAHPL 
GVKLVFS KEPSKPLAPRI L PNPIiAAAAAAAAVAVS S PFSLRTAP 
AATLFQTSALPPALLRPAPGPIRTAHTPVLFAPY j 


6039 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEKAAKITELINKL "I 
NPLDEAEKDLATVNSNPFDDPDAAELNPFGDPDSEEPITETASP 
RKrEDSFYNNSYWPFKEVQTPQYLNPFDEPEAFVTIKDSPPQST 
KKKNIRPVDMSKYLYADSSKTEEEELDESNPFYEPKSTPPPNNL 
VNPVQELETBRRVKR KAPAPP VLS PKTGVLNBNTV5 AGKDLS TS | 


6040 






PKPS PI P S P V LGRKPNAS QS LL VW CKE VT KN YRGVKI T Nf F TTS 
RNGLSFCAILHHFRPDLIDYKSLNPQDIKENNTCKAYDGFASIGI 
SRLLBPSDWLIAIPDKLTVm'YLYQIRAHFSGQELNVVQlEEN 
SSKSTYKVGNYETDTNSSVDQEKFYAKLSDLXREPBLQQPISGA 
VDFLSQDDS VPVNDSGVGESBSEHQTPDDHLS PSTAS P YCRRTK 
SDTEPQKSQQSSGRTSGSDDPGICSOTDSTQAQVLLGKKRLLKA 
ETLELSDLYVSDKKKDMSPPPICEBTDEQKLQTLDIGSNLBKEK 
LENS RSLE CRSDPBS P I KKTS LS PTS KLG YS YS RDLDLiAK K KHA 

SLRQTESDPDADRTTLNHADHSSKIVQHRLLSRQKELKERARVL 
LBQARRDAALKAGNKHNTNTAAPFCNRQLSDQQDEERRRQLRER 
ARQLXAEARSGGKMSELPSYGERAAEKLKBRSKASGDENDNIEI 
DTNEEI PEGFWGGGDELTNLENDLDTPBQNSKLVDLKLKKLLE 
VQPQVANSPSSAAQKAVTESSEQDMKSGTEDLRTERLQKTTERF 
RKPWFSKDSTVRKTQLQSFSQYIENRPEMKRQRSIQEDTKKGN 
EEKAAI TETQRKPSEDEVLNKG FKDS \ SQ YWGELAALENEQKQ 
IDTRAALVEKRbRYLMDTGRNTEEEEAMMQE WFMLVNK KNALIR 
RKNQLSLLEKBHDLERRYELLNRELRAMLAI EDWQKTEAQKRRE 
QLLLDE jVALVNKRDALVRDJ » n AQEKQAEEEDEHLBRTLEONKG 
KMAKKEEKCVLQ 


6041 


475 


1052 


PTALMTAPSCAFP VQFRQPS VSGLSQ ITKSLYI SNGVAANNKLM " 
LSSNQ I TM VINVS VEWNTtiYEDI QYMQVP VADS PNSRLCDF FD 
PIADHIHSVEMKQGR\TLLHCAAGVSRSAALCLAYLMKYHAMSL 
LDAKTWTKS CRP I IRPNSGFWEQL IHYEFQLFGKNT VHM VS S PV 
GMIPDIYBKEVRLMIPL 


6042 


2 


3886 


tbkdektahnx,envlihfwhrlseicvak:isepeadvesvlgvs 

NLLCVLQKPKGSJjKSSKKKNGKVRFADEILESNKENEKCVSSEG 
EKIECWELTTE PS LTHNS SGLLS PLRKKP LEDLVCKLAD I S INY 
VNER KS EQHLR FLSTLLDS FS S SR VFKMLiLGDB KQS I VQAKPLB 
I AKLVQKNPAVQFLYQ KL IGWLNEDQRKDFGFLVDI LYS ALRCC 
DNDMERKKVLDDLTKVDLKWNSLLKI IEKACPSSDKHALVTPWL 
KGDILGEKLVNLAnCLCNEDLESRVSSESHFSERWTU.SLVLSQ 
HVKNDYLIGDVYVER 1 1 VKLHBTLiFKTKKLS EABSSDSSVSFXC 
DVAYN YFS SAKGCLLM PSS BDLLLTL FQLCAQSKEKTHLPDFL I 
CKLKNTWLSGVNLLVHQTDSSYKESTFLHLSALWLKNQVQASSL 
DINSLQVLLSAVDDLLNTLLESEDSYLMGVYIGSVMPNDSBWEK 
MRQ S LPMQ WLHRPLLEGRLSLNYECFKTDFXEQDI KTliPSKLCT 
SALI^KMVLIALRKETVLBNNELBKIIAELLYSLQWCEELDNPP 
IFLIGFCEILQKMNITYDNLRVLGNMSGLLQLLFNRSREHGTLW 
Shi I AKZ» I LSRS IS SDE VKPHYKRKES FFPLTBGNLHTI QSLC P 

FLSKEEKKEFSAQCIPAXtLGWTKKDIiCSTNGGFGHLAIFNSCLQ 
TKS I DDGEUiHGILKI 1 1 SWKKEHEDI FLFS CNLSEAS PEVLG V 
NIEIIRFLSLFLKYCSSPLAESEWDFIMCSMLAWLBTTSENQAL 
YSIPLVQLFACVSCDLACDLSAFFDSrrLDTIGNLPVNLrSEWK 
E FFS QG IHS LLL PILVT VTGENKD VS ETS FQNAMLKPMCETLT Y 
I SKEQLLSHKLPARLVADQKTNLPE YLQTLLNTIAPLLLFRARP 
vyiAVrHMLYKLMPELPQYDQDNLKSYGDEEEEPAIiSPPAAliMS 
LLSIQEDLLENVLGCIPVGQIVTIKPLSEDFCYVLGYLLTWKLI 
LTFFKAASSOLRALYSMYLRKTKSLNICLLYHLFRLMPENPTYAE 
TAVEVPNKDPKTFFTBELQLSIRBTTMLPYHIPHLACSVYHMTL 
KDLPAMVRLWWNSSEKRVFNIVDRFTSKYVSSVLSFQE1SSVQT 
STQLFNGMTVKARATTREVMATYTIEDIVIELirQLPSNYPLGS 
1 1 VESG KRVG VAVQQWRN WMLQLS TYLTHQNGS I MEGLAL WKNN 
VDKRFEG VED CMI CFS VIHGFNYS LPKKACRTCKKKFKSA \ CLY 
KWFTSSNKSTCSLCRETFF 




1306 


253 J 
( 
( 
I 
C 
I 

c 


1AEIAPASPSD I lCAS VSNGDTTLLCSRRQSCGMNK VRQ VS LT YP 
3SPAPSHSLPLQPRSGGSLCPSRAW/PDPHQLFDDTSSAQSRGY 
3AQ RAPGGLS YPAAS PTPHAAFLADP VSNMAMAYGSSLAAQGKE 
-VDKNIDRFIP 17KLKY YFAVDIMYVGRKLGLLFFP YLHQDWEV 
iYQQDTPVAPRFDVNA^DLYlPAMAFITYVLVAGLALGTQDRFS 
>DLLGLQASSAIAWLTLEVIAIIiLSI,YL\rrVNTDLTTIDLVAt'L 
JYKYVGMI GGVLMGLLFGKI G YYLVLGWCCVAI FVFWIRTLRLK 


6043 


403 


*99 


i LADAAAEG VP VRGARN QliRMYLIMAVAAAQ PMLMYWLTFH LVR 

LCLFFPFPCATPVLPLPSLlSAt.yCLS^tSVSSWFCPCQPPLPC 
PLP P LQN KTAKGSLS TEQSERG 


6044 


793 


412 


!O,J5M^FTI,ISKVKISREVTMIASKFGIG0QVRHSLIK3YIiGVV^ 

DrDPVYSLSEPSPDBLAVNDBLRAAPWYHWMEDDNGLPVHTYI, 

AEAQLSSELQDEHP\BQPSMDELAQTIRKQLOAPRLRN 


6045 


155 


2299 


SPLPQVAAMNYLRRRLSDSNFMANLPKGYMTDLQRPQPPPPPPG 
AHSPGATPGPGTATAERSSGVAPAASPAAPSPGSSGGGGFFSSL 
SNAVKQTTAAAAATFS EQVGGGSGGAGRGGAAS R VLLVI DEPHT 
DWAKYFKGKKIHGBIDIKVBQABFSDLNLVAHANGGFSVDMEVL 
RNG VKWRS LKPDFVL IRQHAFSMARNGDYRSL V IGLQYAG IPS 
VNSLHSVYNFCDKPWVFAQMVRLHKKLGTEEFPLIDQTFYPNHK 
EMLS S \TTYP VWKMGHGTLWGWGKVKVDNQHDFQDI AS WALT 
KTYATAE P F IDAKYDVRVQK 1 GQN YKA YMRTS VSGNWKTNTGSA 
MLEQIAMSDRYKLWVDTCSEIFGGLDICAVEALHGKDGRDHIIE 
WGS SMPL IGDHQDEDKQL IVELVVNKMAQALPRQRQRDASPGR 
GSHGQTPSPGALPLGRQTSQQPAGPPAQQRPPPQGGPPQPGPGP 
QRQG P PLQQRPP PQGQQHLSG LG PPAGS PLPQR LPS PTS APQQ P 
AS QAAPPTQGQGRQSR PVAGG PG APPAARPPAS PS PQRQAGPPQ 
ATRQTSVSGPAPPKASGAPPGGQQRQGPPQKPPGPAGPTRQASQ 
AGPVPRTGPPTTQQPRPSGPGPAGRPKPQLAQKPSQDVPPPATA 

AAGGPPHPQLNKSQSLTNAFNIiPEPAPPRPSLSQDEVKAETIRS 
LRKSFASLFSD 


6046 


212 


1075 


EGLTG ?CER V P FLLG RG P PHGATRAGHRRAVR WAG PES L PPL P R 
SL I MDS PRAGTHQGPLDAETBVGADRCTSTAYQEQRPQ VEQVGK 
QAPLSPGLPAMGGPGPGPCEDPAGAGGAGAGGSEPLVTVTVQCA 
FTVALRARRGADLSSLRALLGQALPHQ \AQLGQI>S YLAPGEDGH 
WVPIPEEESLQRAWQDAAACPRGLQLQCRGAGGRPVLYQWAQH 
SYSAQGPEDLGFRQGDTVDVLCE VDQAWLBGHCDGR IGI FP KCF 
WPAG PRMSGAPGRL P RS QQGDQ P 


6047 


49 


1405 


PVLVTS LRMREADTLRPPQLMEVS AD 1 1 STVEFNHTGEL LATGD 
KGGRWIFQREPESKNAPHSQGEYDVYSTFQSHBPEFDYLKSLE 
IEBKINKIKWLPQQNAAHSLLSTNDKTIKLWKITSRDKRPEGYN 
LKD3EGKLKDLS TVTSLQVP VLKPMDLMVEVS PRRI FANGHTYH 
INS I S VNS DCETYMSADDLR INLWHLAI TDRS FTP \NI VD IKPA 
NMEDLTE VITAS E FHPHHCNLFVYSS S KGSLRLCDMRAAALCDK 
KSKIiFEEPEDPSNRSFFSEIIS\SVSDVKFSHSDRYMLTR\DYL 
TVKVWDL \NMEARP IBTYQVHD YLRS KLCSLY ENDCI FD KFE CA 
WNGSDS V rMTGA\ YNN FFRMFDRNTKRDVTL\ EASRES S KPRAV 
LKPRRVCVGGKRRRDD I S VDS LDFTKKI LHTANHPAEN 1 1 AIAA 
TNNLY I FQDKVNSDMH 


6048 


1 


3194 


GIRTP KFCDSPTSDLEMRNGRGRGKRMR PNSNTPVNETATASDS 
KGTSNS S KTRAGANS KGRRGS QNSS BHRPPASSTS EDVKAS PS S 
ANKRKNKPLSDMELNSSSEDSKGSKRVRTNSMGSATGPLPGTKV 
BPTVLDRNCPSPVLI DCPHPNCNKKYKHIMGLKYHQAHAHTDDD 
SJG?EADGDSEYGEEPILHADLGSCNG\ASVSQK\GSLSPARSAT 
PKVRLVEPHSPSPSSKFSTKGLCKKKLSGBGDTDLGALSNDGSD 
DGPSVMDETSND AFDS LER KCME KEKCKKP SS LKPEX I PSKSLK 
SARPI /APLAIPPQQI YTFQTATFTAASPGSSSGLTATVAQAMP 
NSPQLKPIQPKPTVKGEPFTVNPALTPAXDKKKKDKJOCKESSKE 
LESPLTPGKVCRAEEGKS PFRES SGNGMKM EGLLNGSS DPHQS R 
LAS I KAEADKI YS FTDNAPSPS IGGSSRLENTTPTQPLTPLHW 
TQ>IGAEASSVKTNSPAYSDISDAGEDGBGKVDSVKSKDAEQI»VK 
EGAiCKTLFPPQPQSICDSPYYQGPESYYSPSYAQSSPGALNPSSQ 
AGVESQALXTKRDEEPES I EGKVKND I CE E KKP ELS S S SQQPS V 
IQQRPKMYMQSLYYNQYAYVPPYGYSDQSYHTHLLSTNTAYRQQ 
YEEQaKRQSLBQQQRG VDKKAEMGLKEREAALKEEWKQ KPS I P P 
TLTKAPSLTDLVKSGPGKAKEPGADPAKSVI I PKLDDS SKLPGQ 
AP EG LKVKLSDASHLS KEAS BAKTGAE CGRQAEMDP I LWYRQEA 
BPRMWTYVYPAKYSDIKSEDERWKEERDRKLXEERSRSKDSVPK 








EDGXESTSbDCKLPTSEESRLGSKBPRPSVHVPVSSPLTQHOSY 
I P VMHGYS YSQS YDPNHPS YRSMPAVMMQNYPGS YLPSSYSFS P 
YG9KVSGGEDADKARASPSVTCKSSSESKAI.DILQQHASHYXSK 
SPTISDKTSQERDRGGCGWGGGGSCSSVGGASGGERSVDRPRT 
o trowtuj w„ i rtttnn tinjjG YS LLPAQYKLP YAAGLSSTAIVASQQG 
STPSLYPPPRR 


6049 


215 


1089 


AMTGVF DRR VPS IRSGDFQAP FQTS AAMHH PS QESPTL PES S AT 
DSDYYSPTGGAPHGYCSPTSASYGNKALNPYQYQYHGVNGSAGS 
YPAKAYADYSYASSYHQYGGAYNRVPSAT'NQPEKEVTEPEVRMV 
NGKPKKVRK P RTI YS S FQIiAALQRRFG; KTQYLALPERAELAASL 
GLTQTQVKI W FQNKRS KI KK I M KNGEM P PEHS PS S SD PMACNS P 

QSPAVWEPQGSSRSLSHHPHAHPPTSKQSPASSYLENSASWYTS 
AASSINSHLPPPGSLQHPLALASGTLY 


6050 


566 


1*71 ft 


kglertccameesdsekttbkenlgprmdpplgepgVgslgwvl 

PNTAMKKKVLLMGKSGSGKTSMRS 1 1 FANYIARDTRRLGATILD 
RIHSI^INSSLSTYSLVDSVGNTKTKDVEHSHVRFLGNLVLNLW 
DCGGQDTFMENYFTSQRDNI FRNVE VL I YVFDVESRELEKDMHY 
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIPKBREEDLR 
RLSRPLECSCFRTSIV7DETLYKAWSSIVYQLIPNVQQLEMNLRN 
FAE 1 1 EADEVLL FERATFLV1 SHYQCKEQRDAHRFE K I SNI I KQ 
FKLSCSKLAASFQSMEVRNSNFAAFID2 FTSNTYVMWMSDPSI 
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 


6051 
6052 


566 

566 


J. / ±a 


KGLERTCCAMEESDSBKTTEKBNIjG?RMDPPLGEPG\gSLGWVL 
PNTAMXKKVLLMGKSGSGKTSMRS 1 1 FANYIARDTRRLGATILD 
RIHSLQllISSLSTYSLVDSVGm'Kl-FOVBHSHVRFI^NLVLNLW 
DCX3GQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY 
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR 
RLS R PLECS CFRTS IWDETLYXAWSS IVYQL I PNVQQLEMNLRN 

FAEIIEADBVLLFERATFLVISHYQCXEQRDAHRFEKISNriKQ 
FKLSCSKLAASFQSMEVRMSNFAAFIDIFTSNTYVMVVMSDPSI 
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 






X f La 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL 
PNTAMKKKVLLMGKSGSGKTSMRS 1 1 FANYIARDTRRLGATILD 
RI HSLQ I NSSTjSTYSLVDS VGNTKTFDVEHSHVRFLGNLVLNLW 
DCGGQDTFMENYFTSQRDN I FRNVEVLI YVFDVESRELEKDMHY 
YQSCLEAILQNS PDAK I PC L VHKMDLVQEDQRDL I FKEREEDLR 

RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN 
FAEI IEADEVLL FERATFLVI SH YQCKEQRDAHRFEKI SNI I KQ 
FKLSCSKLAASFQSMBVJ^SNFAAFIDIFTSNTYVMVVMSDPSI 
PSAATLINIRNARKHFEKLERVtXSPKQCIiLMR 


6053 
6054 


201 


1704 


KGTEMNKSRWOSKRRHGRRSHQQNPWPRLRDSEDRSDSRAAQPA 
HDSGHGDDBSPSTSSGTAGTSSVPELPGFYFDPEKKRYFRLLPG 
HNNCKPLTKES I RQKEMES KRLRLLQEEDRRKK1ARMGFNASSM 
ijK ^WwrljriViwrCHLAHELRLSC^ 

RFNL I LADTNSDRLFTVND VTVGGS KYGI INLQSLKTPTLKVFM 
HENLYFTNR KV\NS VCWAS LNHLDSH I LLCLMGLAETPGCATLL 
PASLFVNSHPAGIDRPG\MLCSFRIPGAWSCAWSLNIQANNCFS 
TGLSRRVIiLT^^/VTGHRQSFGTNSI?VIJAQQFALMAPLLFNGCRS 
G 3 1 FA I DLRCGNQGKGW KATRL FHDS AVTS VRILQDEQYIMASD 
MAGKI KL WDLRTTKCVRQYEGH VNE YAYL PLHVHEE EG ILVAVG 

QDCYTRIWSLHDARLLRTIPSPYPASKADIPSVAFSSRLGGSRG 
APGLLKAVGQDLYCYSYS 




1 


1054 

1,: 


P r* 1 AKLOEFGTSRRHMAAPSG VH LLVRRGSHRI FS S PLNHIyOF"" 
KQS S SQQRRN FFFRRQRDISHS I VLPAAVS S AHPVPKHI KKPDY 
VTTGI VPDWGDSI EVXNEDQ I QGLHQACQLARHVLLLAGKSLKV 
DMTTEETD ALVHRE 1 1 SHNAYPS PLGYGGFPKS VCTSVNNVLCH 
3 1 PDSRPLQDGDI IN ID VIVYYNG YHGDTSETFLVGNVD3CGKK 
E*VE VARRCRDEA1AACRAGAPFS VI GNTISHITHQNGFQ VCPH F 
/GHGIGS YFHGHPEIWHHANDSDLPMEEGMAFTIEP1ITEGSPE 
PKVLEDAWTWSIiD/ TS KVSAQ FEHTVL I TSRGAQI LTKLPHEA 


6055 


421 


23*4 


PPYFLLS FLAWWLYGQS DRTETD t SQSAGPPPGTLQCSALHHD P 
GCANCSRFCRDCSPPACQCHTKVPPGNALNGVQPPBLSRTLALI 
SSRSPPRKKKKSQTETGKERERTSFLTQGGKRPELQHGLAGICM 
TLLITGDS IVSAEAVWDHVIMANRELAPKAGDVIKVLDASNKDW 
WWGQIDDEEGWFPASFVRLWVNHEDBVEEGP3DVQNGHLDPNSD 
CLCLGRPLQNRDQMRANVINE 1 MSTERHYI KHLKDI CEGYLKQC 
RKRRDMFSDEQLKVIFGNIEDIYRFQMGFVRDLEKQYNNDDPHL 
S B IGPCPLEHQDG FWI YSEYCNNHLDACMELSKLMKDSRYQHFF 
EACRLLQQMI D I A\ I DGFLLTPVQKI CJCYPLQLAELLJCYTAQDH 
SDYR YVAAALAVMRNVTQQINERKRRLENI DK IAQKQAS VLDWE 
G3DILDRSSELIYTGBMAWIYQP\YGRNQQRVFFLFDHQMVLCK 
KDLI RRD IL Y YKGRI DMDKYE WD I EDGRDDDFNVSMKN AFKLH 
NKKTE2IHLFFAKKLEEKIRWLRAPREERKMVQBDEKIGPEISB 
NQKRQAAMTVRKVPKQKGVNSARSVPPSYPPPQDPLKHGQYLVP 
\DGIAQ3QVFEFTEPKRSQSPFWQNFSRLTPFKK 


6056 


43 


3358 


SGGRGPVRVRSEQLSPSABQVSQlSQISLGRRPLSSLPPPPSRA 
LAPTRAPDTAliTIMEVAEVESPIiNPSCKIMTFRPSMEBFREFNK 
YLAYMESKGAHRAGLAKVI PPKEWKPRQCYDDIDNLLI PAPIQQ 
MVTGQSGLPTQYNIQKKAMTVKEFRQLANSGKYCTPRYLDYBDL 
ERK Y WKNLTFVAP I YGAD INGS I YDEGVDEWNIARLNT VLDWE 
EE CG I S I BGVNTP YL YFGMWKTTFAWHTEDMDLYSI NYLHFGEP 
KSWYAIPPEKGKRLERLAQGFFPSSSQGCDAFLRHKMTLTSPSV 
LKKYGIPFDKITQEAGEFMITFPYGYHAGFNHGFNCAESTNFAT 
VRW IDYGKVAKLCTCRKDMVKISMDI FVRKFQPDRYQLWKQGKD 
I YT I DH T KPTP AS T P EVKA WLQRRRKVR KAS RS FQ CARS TS KR P 
KADEEEE VSDB VDGAE VPWPDS VTDDLKVSBKSEAAVKLRNTEA 
SSEEES SASRMQVEQNLSDHI KLSGNSCLSTSVTEDIKTBDDKA 
YAYRSVPSISSEADDSIPLSTGYEKPEKSDPSELSWPKSPESCS 
SVAE3NGVLTEGEESDVESHGNGLBPGEI PAVPSGERNS FKVPS 
IAEGENKTSKSWRHPLSRPPARSPMTLVKQQAPSDEELPBVLSI 

EEEVBETESWAKPLIHLWQTKPPNFAAEQEYNATVARMKPHCAI 
CTLLMP YH fCPDSSNE ENDARWETKLDE WT crn v tit t>t t -d v>ms*v> 

I YSEBNI EYS PPNAFL EEDGTS Lit I SCAKCCVRVHAS CYG I PSH 
EICDGWLCARCKRNAMTABCCLCNLRGGAIJCQTKNNKWAHVMCA 
VAVPEVRFTNVPE RTQ I D VGR I PLQRLKLKCI FCRHRVKRVS G A 
CIQCS YGRCPAS FHVTCAEAAGVL\MEPDDWP YWNI TCFRHKV 
NPNVKSKACEKVI S VGQTV ITKHRNTRY YSCRVMAVTSQTFYE V 
MFDDGSFSRDTFPEDIVSRDCLKLGPPAEGEWQVKWPDGKLYG 
AKYFGSNIAHMYQVEFEDGSQIAMKREDIYTLDEELPKRVKARF 
VSAGRCHLGTCQVNS LS S PHVS QAQQET YLGFWINS KXSQCNX F 
LSGTY 


6057 


1 


853 


FVARLKEQEGEGGLGPRKEKGRARGREKRRKMQLTRCCFVFLVQ - 
GSLYLVICGQDDGPPGSEDPBRDDHEGQPRPRVPRKRGHISPKS 
RPMANSTLLGLLAPPGEAWG I LGQPPNRPNHSPPPSAKVKXI FG 
WGDFYSNIKTVALNLLVTGKIVDHGNGTFSVHFQHNATGQGNIS 

islvppskavefhqeqqifieakaskifncvrmewekvb\rgrr 

TSLFTHDPAKI CS RDHAQS SATWS CSQP FKVVCVY I AF YSTD YR 
LVQKVCPDYNYHSDTPYYPSG 


6058 
6059 


1 


986 


hplpsaslglpsvslgvslcvrsalleawpmlpkrrrarvgsp 
sgdaasstppstrfpgvaiylvbprmgrsrrafltglarskgfr 
vldacsseathwmeetsaeeavswqerrmaaappgctppalld 
iswlteslgagqpvpvecrhrlevagpskgplspawmpayacqr 
ptplthhntglsealeilaeaagfegsegrlltfcraasvlkal 
pspvttlsqlqglphfgehssrwqellehgvceevervrrsb/ 
rlftqifgvgvktadrwyreglrtlddlreqpqkltqqqkagep 
sreagpwaslnctldpsastp 




2 


3650 

] 


qqdfssladltdhrahrcpgdgdddpqlswvasspsskdvaspt 

DMIGDGCDIjGLGEEEGGTGLPYPCQFCDKSFIRLSYLKRHEQIH 

SDKL? fkctycsrlf kh krsrdrhi klhtgdkkyhche ceaafs 

RSDHLKIHL KTHSSS KPFKCTVCKRGFSSTS S LQSHMQAHKKNK 
SHlAKSEKEAiOCDDFMCDYCBDTFSQTEELEJCHVLTRHPQIiSEK 








ADLQCIKCPEVPVDENTLLAH IHQAHANQKHXCPMCPE \QFSSV 
\EGVYCHLDSHRQPDSSNHSVSPDPVLGSVASMSSATPDSSASV 
BRGSTPDSTLKPLRGQKKMRDDGQGWTKWYSCPYCSKKDFNSL 
AVLEIHLKT1HADKPQQSHTCQ1CLDSMPTLYNLNEHVRKLHKN 
HAYPVMQ FGNI SAFHCNYCPEM FAD I NS LQEHIRVSHCGPNANP 
SDGNNAFFCNQCSKGFLTESSLTEHIQ\Q\AHCSVGSAKLESPV 
VQPTQS FME VYS CP YCTNS P I FGS I L KLT KH I KENHKNI PLAHS 
KKSKAEQSPVSSDVEVSSPKRQRLSASANSISNGEYPCNQCDtiK 
FSNFESFQTHLXLHLELLLRKQACPQCKEDFDSQESLLQHLTVH 
YMTTSTHYVCESCDKQFSSVDD\LQKI!\LLDMPHPIiCCTHCT\L 
CQEVFDS\ KVS I \QVHLAVKHSNE KXMYRCTACNWDFRKEADIiQ 
VHVTCHSHLGNPAKAHKCIFCX3BTF5TEVBLQCHITTHSKKYNCK 
FCSKAFHAI I LLE KHLRE KHCVFDAATENGTANG VP P MATKKAE 
PADtiQGMLLKNPEAPNSHEASEDDVDASEPMYGCDICGAAYTME 
VLLONHRLRDHN I RP GE D DG SRK KAE FI KGSHKCNVCS RTFFS E 
NGLREHLQTHRGPAKHYMCPICGERFPSLLTLTEHKVTHSKSLD 
1GTCRICKMPLQSEEEFIEHOQMHPDLRNSLTGFRCWCMQTVT 
STLELKIHGTFHMQKLAGSSAASS PNGQGLQKLYKCALCLKBFR 
SKQDLVKLDVNGLPYGIiCAGC3MARSANGQA/GG1jAPPEPADRPCA 
GLRCPE CSVKFESAEDLB SHMQVDHRD LTPE TSGPRKGTQTS PV 
PRKKTYQC1KCQMTFENBREIQIHVANHMIEEGINHECKLCNQM 
FDSPAKLLCHIiI EHSFBGMGGTFKCPVCFTVFVQANKLQQHI FA 
VHGQEDKIYDCSQCPQKFFFQTELQNHTMSQHAQ 


6060 


2145 


202 


S Y E I VG KNKI>E VNHSQL KALCKCSLPS RLLP LGEN LPLLDRG FR"~" 
KEPRSRGSRERDNMLHLHHSCLCFRSWLPAMLAVLLSLAPSASS 
DI SASRPNILLLMADDI.G IGD IGC YGNNTWRTPNT DRLABDG VK 
LTCHISAASLCTPSRAAFLTGRYPVRSGMVSSIGYRVLQWTGAS 
GGLPTNETTFAKILEBKGYATGLIGKWHLGLNCESASDHCHHPL 
KHGFDH FYGMP FSLMGD CAR WELSE KR VNLEQ KLN FLFQVLA L»V 
ALTLVAGKLTHL IPVSWMPVI WSALSAVLI.LASS YFVGALIVHA 
DCFLMRNHTITEQPMCFQRTTPLILQEVASFLKRNKHGPFLLFV 
S FLHVH X PLITMENFLGKSLHGLYGDNVKEMDWMVGRILDTLDV 
EGLSNS TLI Y PTS DHGGSLENQ LGNTQYGGWNGI YKGGKGMGGW 
EGGlRVPGIFRWPGVIiPAGRVIGEPTSLMDVFPTWRIiAGSEVP 
QDRVI DGQDLLPL LLGTAQ HS DHEFLMHYCERFLHAARWHQRDR 
GT^MKVHF\^rPVFQ?EGAGACYGRKVCPCFGEKVVHHDPPLLFD 
LSRD PSETHI LTPAS E PVF YQVMEIR \ VQQAV WEHQRTLSPVPLQ 
LDRLGNIWRPWLQPCCGPFPLCWCLREDDPQ 


6061 


110 


1330 


^IWIHMK&lCT^^fl^TNTyt^PMT"M^^Sr'MPA^^p'x?v»ppT T PeEwcnM ~ 
i u* j,npiJVK<wA AAnin L r £^KMJj£u.1_»1A^I v 1FAv KvK.1 hiXiXiCio E\JGS PN 

VHNY PDMEAVPL LLNNVKG EPPEDSLS VDHFQTQTE P VDLS INK 
ART6PTAVS SSP VSMTASASSPS5TSTSSSSSSRLASS PTVI TS 
VSSASSSSTVLTPGPLVASASGVGGQQFIiH IIHPVP PSSPMNLQ 
SNKLSHVHR IPVWQ3VPVVYTATOSPGNVNNTIVVPLLEDGRG 
HGKAQMDPRGLS PRQSKSDSDDDDLPNVTLDS VNETGSTALS IA 
RAVQEVHPSPVSRVRGNRWNNQKFPCSISPFSIESTRRQRTVLM 
PPDSRKTAYSTDCDF\EGLQQKLYTKSSSPGRVHRRTHTGEKPY 

1ALHRRRHMLV 


6062 


71 


1079 


ETMAKNGPENCSDCHILNAEAPXSKKICKSLKICGLVFGILALT 
LIVIiFWGSKHFWPEVPKKAYDMEHTFYSNGEXKKIYMEIDPVTR 
TEIFRSGNGTDETLEVHDFKNGYTGIYFVGLQKCFIKTQTKVIP 
E FSEPEE E I DENEE I TTTFFBQSVI WVP AEKP I ENRDFLKNS KJ 
LEICDNVTMYT^\INPTL\ISGTFAKQLHHNFAFIILVSBLQDFE 
EEGE DLHFPANEKKG I EQNEQWWPQ VKVE KTRHARQASEEELP 
INDYTENGIEFDPMtiDERGYCCIYCRRGNRYCRRVCEPLIjGYYP 
YPYCYQGGRVI CRVIMPCNWW VARMLGRV 


6063 


71 


1079 


ETMAKNGPENCEDCHILNAEAFKSKKICKSLKICGLVFGILALT " 
LIVLFWGSKH FWPEVPKKAYDMEHT FYS NGE KKKI YM E I D? VTR 
TE I FRSGNGTDETLBVHDFKNG YTG I YFVGLQKCFI KTQ I KVI P 
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNTSKI 
LEICDNVTMYW\INPTL\ I SGTFAKQLHHNFAF IILVS ELQDFE 








EEGEDLHFPANEKKQIEQNEQWVVPQVKVEKTRHARQASEBELP 
INDYTENGIBFDPMIiDERGyCCIYCRRGNRYCRRVCEPLLGYYP 
YP YCYQGGR VI CR VI M PCN ft WVARMLGR V 


6064 


913 




NLPQSLPRPTEHSPPYSLEKMTDLVAVWDVALSDGVHKIEPEHG 
TTSGKRVVYVDGKBE IRKEWMFKLVGKETFYVGAAKTKATINID 
AISGFAYEYTLEINGKSLKKYMEDRSKTTNTW\njHMDGENFRIV 
LE KDAMD VWCNGKKLETAGE FVDDGTETHFS IGTH\ACYIKAV\ 
SSG \ KRKEG I IHTLIVDNRE IPE IAS 


6065 


1153 


641 


MS VRVARVan VRGLGAS YRRGAS SFP VPPPGAQGVAEL'fjRDATG 
AEEEAPWAATERRMPGQCSVLLFPGOGSQWGMGRGLLNYPRVR 
BLYAAARRVLGYDLLELSLHGPQETLDRTVHCQPAIFVASLAAV 
E ECLHHLQ P S VI ENCVAAAGFS VGEFAAL VFAG AMEFAEG 


6066 


68 


3470 


VKENMPATRKPMRYGHTEGHTBVCFDDSGSFIVTCGSDGbVRIW 
BDLDDDDPKFINVGEKAYSCALKSGKLVTAVSNNTIQVHTFPEG 
VPDGI LTRFTTNANHVVFNGDGTKIAAGSSD\ FLVKI VDVMDSS 
QQKTFRGHDAP VLS LS FDPKD I FLASASCDGSVRVWQ IS DQTCA 
I S WPIiLQKCN DVTNAKS I CRLAWQP KS3KLLAI PVEKSVKL YRR 
ES WSHQFDLSDNFISQTLNI VTWS PCGQ YLAAGS INGLI I VWNV 
ETiCDCMERVKHEKGYAICGLAWHPTCGRISYTDAEGNLGLLENV 
CDPSG XTSS S KVSSRVEKDYNDIiFDGDDMSNAGDFLNDNAVEI P 
SFSKG I INDDEDDEDLMMASGRPRQRSHILEDDENSVDISMLCT 
GSSLLKBEEEDGQEGSIHNLPLVTSQRPFYDGPMPTPRQKPFQS 
GSTPLHLTHRFMVWNS IG 1 1 RCYNDEQDNA I DVE FHDTS IHHAT 
HLSNTLlOTIADLSHEAILLACrESTDELASKIJICLHFSSWDSSK 
EWI IDLPQNEDI EAICLGQG WAAAATS ALLLRL FT IGGVQKE VF 
SLAG P VVSMAGFTGEQL FI VYHRGTGFDGDQCLG VQLLE LGKKKK 
QILHGDPLPLTRKSYIAWIGFSAEGTPCYVDSBGIVRMLNRGLG 
NTWTPI CNTREHCKGKSDHYWVVG IHENPQQLRCI PCKGSRFP P 
TLPRPAVAILS FKLPYCQ IATEKGQMEEQFWRS VI FHNHLDYLA 
KNGYEYBESTKNQATKEQQELLMKMLALSCKLBREFRCVELADL 
MTQN A VNLA I K YAS RS R KL I LAQ KLS E LAVE KAA E 7 iT ATQ VE E E 
EEEE DFRKKXiJNAG YSNTATE WSQPR FRNQ VEEDAEDSGEADD E E 
KPEIHKPGQNSFSKSTNSSDVSAKSGAVTFSSQGRVNPFKVSAS 
SKEPAMSMNSARSTNILDNMGKSSKKSTAJ^RTTNNEKSPI IKP 
LIPKP KPXQASAAS YFQKRNSQTNKTEEVKEENLKNVLSETPAI 
CPPQNTENQRP KTGFQ MWLEENRSN I LSDN PDFSDEADI I KEGM 
IRFRVLSTEERKVWANKAKGETASEGTEAKKRKRVVDESDETEW 
QEEKAKENLNLSKKQKPLD FSTNQKLS AFAFKQE 


6067 


858 


321 


LP WQRLGVL LSRGKMAVTGWItES LRT AQ KTALLQDGRRICVHYL F 
PDGKEMAEE Y DEKTS ELLVRKWRVKSALG AMGQWQLE VGDPAPL 
GAGNLGPELIKESNANPIFMRKDTKMSFQWRIRNLPYPXDVYSV 
SVDQKERCI IVRTTNKKYYKKFS IPDL0RHQLPLDDALLSFA\T 
PTAP 


6068 


13 


1730 


GSKMADLANEEKPAIAPPVFVFQKDKGQKSPAEQKWLSDSGEEP 
RGEAEAPHHGTGHPESAGBHALEPPAPAGASASTPPPPAPBAQL 
P P FPRELAGRSAGGSS PEGGRDSDREDGN YC PPVKRERTS SLTQ 
FPPSQSEERSSGFRLKPPTLIHGQAPSAGLPSQKPKEQQRSVLR 
PAVLQAPQPKALSQTVPSSGTNGVS LPADCTGAVPAASPDTAAW 
RSP S BAADE VCALEEKEPQKNESSNASEB EACEKKDPATQQAFV 
FGQNLRDRVKLII^SVDEADMENAGHPSADTPTATNYFLQYISS 
SLEN3TNSADASSNKFVFGQNMSERVLSP pklnevssdankena 
AAESGSES S S QEATPEKES LABS AAA YTKATARKCLLE KVEVIT 
GEEAESlJVLQI'lC^KLFVFDKTSQSVrVERGRGliLRIiNDMASTDDG 
TLQSRLSDAGPRSSLR\LILNTKLKAQMQIDKASEK\SIRITAM 
DNEDQGVfCVFLISASSKDTGQVYAALHHRILALRSRVEQEQEAX 
yjPAPEPGAAPSNEEDDSDDDDVIAPSGATAAGAGDEGDGQTTGS 


6069 


583


27 


PTRPGQAGSSsAMAAQRLGKRVLSKLQSPSRARGPGGSPGGLOK 
RKARVT VKYDPJ^LQRRLDVEKWIDGRLEELYRGMEADMPDE IN 
I DELLBLESEEERSRKI QGLLKS CX5KP VEDFI QELLAKLQGLHR 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
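The legend above can be read as a small lookup table. The following is a minimal sketch, not part of the specification, of how an amino acid segment from this table might be checked against that legend. The helper name `summarize_segment` and the returned field names are hypothetical and chosen only for illustration. Whitespace is dropped on the assumption that it comes from line wrapping rather than from the sequence itself.

```python
# One-letter codes exactly as listed in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}

# Symbols in the legend that annotate the sequence rather than name a residue.
MARKERS = {"*", "/", "\\"}


def summarize_segment(segment: str) -> dict:
    """Hypothetical helper: count residues and markers in one table segment.

    Characters that do not appear in the legend are reported separately
    instead of being guessed at.
    """
    cleaned = "".join(segment.split())  # assumption: whitespace is wrapping only
    residues = sum(1 for ch in cleaned if ch in LEGEND and ch not in MARKERS)
    return {
        "residues": residues,
        "stops": cleaned.count("*"),
        "deletions": cleaned.count("/"),
        "insertions": cleaned.count("\\"),
        "unrecognized": sorted({ch for ch in cleaned if ch not in LEGEND}),
    }
```

A call such as `summarize_segment("PTAP")` reports four residues and no markers. Any character outside the legend is listed under `unrecognized`, which is one way to flag rows that may need checking against the original sequence listing.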








Q\PGLRQPSPS?\DGQPSAPPQGPGARTASPLTI,LALFPGPPER 
RPALLCVLSCI


6070


478 


858


irvtvdgeflhyifplqfldspew/rjtbthrgrhf\qvtltae " 

TDCRYVSWRRKXI>YLL FAQHRY I S RLFS VLIGSD I ADKL YALND 
RVY1 GKR YKYD I RLPNFYQ MSTPE I RRS PLTQH FQNSRRYW 


6071 


2 


1654 


HFJ^TKGNMAIJU^P\VRLFSLVTRJLI*LAPRRGLTVRSPDEPLPV 
VR1 P VALQRQLEQRQSRRRNLPR P VLVRPGP LLVS ARR PELNQP 
ARLTLGRWBRA PLASQGWKS RRARRDHFS I ERAQQEAPAVRKLS 
SKGS FADLGAWKPRVLHALQE\AAP E WQ\ PTTVQS ST I PS LLR 
GRHWCAAETGSGKTLSYLLPLLQRLLGXHPSLDStiPrPAPRGL 
VLVP SR3IAQQVRAVAQP LGRSLGLLVRDTjEGGHGMRRIRLQLS 
RQPS ADVLVATPGALWKALKSRLI SLEQLSFLVLDEADTLLD2S 
FLELVDYT LE KSHI AEGPADLEDP FNP KAQLVLVGATFPEGVGQ 
LLNKVAS PDAVTTI TSS KLHCIMPHVKQ TFLRLKGADKVAELVH 
ILKHRDRAERTGPSGTVLVFCNSSSTVNWLGYILDDHIQQRLRlj 
QGQMPALMRVGIFQSFQKSSRDILLCTDIASRGLDSTGVELWN 
YDFPPTLQDYIHRAGRVGRVGSBVPGTVISFVTHPWDVSLVQKI 
ELAARRRRSLPGLASSVKEPLPQAT 


6072 


1 


742 


kmertemmptinsqlefkskpfplvsssrwlvkr^eL^Ayvedt 

VLFSRRTSKQQVYFFLPMDVL1 ITKKKSEES YNVNDYSLRDQLL 

VESCDNEELNSS pgknsstmlysrqssashlptltvlsnhanek 

VEM L LGAETQS ERARWI TALGHSSGKPPADRTSLTQVE 1 VR S FT 
AKQPDELSLQVADWIiI\YQRVSDGWYEGER\liRDGERGKFPME 
CAKEITCQATIDKNVERMGRLLGLETNV 


6073 


620 


360 


PCRRGLAR PliS RRPG/ S I LVHCAVG VS RS ATLVLA YLML YHHLT " 
LVEAI KECVKDHRG 1 1 PNKGFLRQIiLALDR RLRQG LEA 


6074 


168 


1110 


PGARCMATELQCPDSMPCHNQQVNSASTPSPEQLRPGDLILDHA 
GGNRASRAKVILLTGYAHSSLPAEXDSGACGGSSLNSEGNSGSG 
DSSSYDAPAGNSFLEDCELSRQIGAQLKLLPMNDQ1RBLQTIIR 
DKTASRGDFMFSADRLIRLWEEGIjNQIjPYKECMVTTPTGYKYE 
OVKFEKGNCGVS1MRSGEAMKQGLRDCCRSIR1GKILIQSDEET 
QRAKVYYAKFPPDIYRRKVt»LMYPILQTG\NTVIEAVKVLIEHG 
VQPSVI ILLSLFS7PHGAKS I IQEF PKI T I LTTBVHPVAPTHFG 
QKYFGTD 


6075 


320 


1091 


P?TCQPQEVEHH\YGYVPILGNKTLPSRCHQCVIVSSSSHIiLGT 
KLGPEIERABCTIRMNDAPTTGYSADVGNKTTYRWAHSSVFRV 
LRRPQEFVNRTPETVFrFWGPPSKMQKPQGSLVRVIQRAGLVFP 
NMEAYAVS PGRMRQFDDLFRGETGKDRE KSHS WLSTG WFTMVI A 
VELCDHVHVYGMVPPNYCSQRPRLQRMPYHYYEPKGPDECVTYI 
QNEHSRKGNHHRFI TEKRVPSS WAQLYG I TFSHPSWT 


6076 


1721 


107 


HPSPTEAPRVQHLTMDCTWRILFLVAAATGTHAQVQLVQSGAEV
KKPGASVKVSCKVSGYTLTELSMHWVRQAPGKGLEWMGAFDPED
GETIYAQKFQGRVTMTEDTSTDTAYMELSSLRSEDTAVYYCATD 
HGDYAFDI WGQGTMVTVSSAPTKAPDVF P 1 1 SGCRHP KDNS P W 
LACIiITOYH?TSV\TVTWYMGTQSQA\QRTFPEIQRRDSYYMTS 
SQLSTPLQQWRQGBYKCWQHTASKSKKEIFRWPESPKAQASSV 
PTAQPQAEGSLAKATTAPATTRNTGRGGEEKKKEKBKEEQEERE 
TKTPECPSHl'QPLGVYLLTPAVQDLWLRDKATFTCPVVGSDLKD 
AHLTWEVAGKVPTGGVEEGLLERHSNGSQSQHSRLTLPRSLWNA 
GTSVTCTLNHPSLPPQRLMALREPAAQAPVKLSLNLLAS SDPPE 
A\ ASWLL CE VSGPSPPN I LLMWLEDHGEVNTSGFAPAR PLPKP \ 
RSTT FWA\ WS VLRVPAP PS PQ PATYTCWSHEDSRTLLNASRS L 
EVSYVTDHGPMK 


6077 


3687 


1268 


LLPDMNIiQPI FW IGLISSVCCVFAQTDENRCLKANAKS CGECIQ 
AGPNCGWCTNSTFLQEGMPTSARCDDLEALKKKGCPPDDIENPR 
GSKDIKKNXNVTNRSKGTAEKLKPEDITQIQPQQLVLRLRSGBP 
QTFTLKFKRAEDYPI DLYYLM\DLS YSMKDDLENVKSLGTDLMN 
EMRRI TSDFR IGFGS F VEKTVMPYI STTPAK.LRNPCTS EQNCTS 
PFSYKNVLSLTNKGBVFNELVGKQRISGNLDSPEGGFDAIMQVA 
VCGSLIGWRm^TRLLVFSTT)AGFHFAGDGKlX?GrVLPNDGQCHL 








ENNMYTMSHYYDYPSIAHLVQKLSENNIQTIFAVTEEFQPVYKB 
LKNLIPKSAVQTLSANSSNVIQLlIDAyNSLSSEVILBNGKLSB 
GVTISYQSY\CKNGVNGTGBNGRKCSWISIGDEVQFEISITSNK 
CPKXDSDSFKIRPLGFTBEVEVI LQYI CECECQSEGI PESPKCH 
BGNGTFECX3ACRCNEGRVGRHCECSTDEVNSBDIGCFTARKENQ 
FQKSASNHGRVPSAGQCVCRKRDNTNBIYSGKFCECDNFNCDRS 
NGLICGGNGVCKCRVCECNPNYTGSACDCSLDTSTCBASNGQIC 
NGRGICECGVCKCTDPKFQGQTCEMCQTCLGVCAEHKECVQCRA 
PNKGBKKDTCTQECSYFNITKVESRDKLPQPVQPDPVSHCKEKD 
VDDCWFYPTYSVNGNNEVMVHWENPECPTGPDI IP IVAG WAG 
I Vb I GLALLLI WKLLM I IHDRREFAKFBKEKMNAKWDTGENP I Y 
KSAVTTWNPKYEGK 


6078 


1426 


180 


BTEDVMELLEEDLTCPICCSLFDDPRVLPCSHNFCKKCLEGIIiE 
GS VRNSLWRPVPFKCPTCRKKTFS YWELI PLQVNYSLKGI VEKY 
WKIXISPKMPVCKGH\LGQPLNIF\CL\TDMQLDL/CGIC\ATR 
GEHTKHVFCSIEDAYAQERDAFESLFQSFETWRRGDALSRLDTIi 
ETSKRKSLQIiLTKDSBKVKEFFEKLQHTLDQKKNEILSDFETMK 
LAVMQAYDPEINKLNTI LQEQRMAPSI ABAFKDVSEPI VFLQQM 
QEFREKI KVI KSTPLPPSNLPASPLMKNPDTSQWEDI KLVDVDK 
LSLPQ DTGTFI S Kl PWS FYKL FLL I LLLGL VI VFGPTM FLE WS L 

FDDIATWKGCLSNFSSYLTKTADFIEQSVFYWEQVTDGFFIFNE 
RFKNFTLWLNNVAEFVCKYKLL 


6079


1586 


141 


ATARDLGCARRIDRWMESTPSRGLNRVHLQCRNLQBFLGGIiSP " 
GVLDRLYGHPATCLAVFRELPSLAKNWVMRNLFLEQPLPQAAVA 
LWVKKEFSKAQEESTGLLSGLRIWHTQLLPGGLQGLILNPIFRQ 
NLRIAuUK3GKAWSDDTSQLGPDKHARE}VPSLDKYAEERWEWL 
H PMVGS PSAAVSQDLAQLLSQAGLMKSTEPGE PPCI TS AGFQ FL 
LLDTPAQLW Y FMLQYLQTAQSRGMDLVEIL5 FLFQ LSFS TLGKD 
YS VEGMS DSLLNFLQHLRE FGLVFQRKRKSRRYY PT /RALAINL 
SSGVSGAGGTVHQPGFIV\VETNYRLYAYTBSELQIALIALFSE 
ML* PFP\NMW\ARVTR\ESVQQAIASGITAQQI IHFLRTRAHP 
VKLKQTPVLPPTITDQI^LWEIiERDRLRFTEGVLYNQFLSQVDF 
ELL \ LAHAPKLG VL VFE / NTPAKRLMWTPAGH S DVKR FWKRQK 
HSS 


6080 


1 


1199 


ibtidhvgefamaaqaagvsrqraatqglgs^qnalkyi^qdfk" 

TLRQQ CLDSGVL FKDP EFPACPS ALG YKDLG PG SPQTQG 1 1 WKR 

ptelcpspqfivggatrtdicqgglgdchllaaiasltlnebll 

YRWPRDQDFQENYAGI FHFQPLCPPS ?\FWQYGEWVEWIDDR 
LPTKNGQLLFLHSEQGNEFWSALLEKAYAKLNGOnSALAGGSTV 
EGFBDFTGGISEFYDLKKPPANLYQI IRKALCAGS LLGCSIDVY 
SAAEAEAI TSQKLVTCSHAYS V1X3VEEVNFQGHPEKLIRLRNPWG 
EVEKSGAWSDDAPEWNHIDPRRKEELDXFCVEDGBFWMSLSDFVR 
y * bKLE I CNLSPDSLSSEE VHKWNLVLFNGHWTRGSrAGGCGNY 
PGSS 


6081


3 


865 


EMLPLLLPLPLLWA/GALAQDARFRLEMPESVTVQEGLCI FVHC 
SVFYLEYGWKDSTPAYGHWFREGVSVDQETPVATNNSTQKVQKE
TQGRFHLLGDPSRNNCSLSIRDARRRDNGSYFFWVARGRTKFSY
KYSPLSVYVTALTHRPDILIPBFLKSGHPSNLTCSVPWVCEQGT
P P I FS WMSAAPTS LGPRTLHS S VLT1 1 PRPQDHGTNI* I CQVT FP 
GAGVTTERTIQLS VS WKSGTVEE VWLAVGWAVKl LLLCLCL I 
ILSFHKKKAVRAVEVEENVYAVMG 


6082 
6083 


283 
1865 


1288 
309 


earspgptqtrtapglaapglaqpaalrlllsrppsaamdgdgd 
pesvgqpeeaspeeqpeeasaeeerpedqqeeeaaaaa\y\lde 
l ?eplla/lrvlaalprhe\lvqacr\lvclrwkelvdgaplwl 
lkcqqeglvpeggveeerdhwqqfyflskrrrnllrnpcgeedl 
egwcdvehggdg wrveelpgdsgve fthdes vkkyfass fewcr 

KAQVTDLQAEGYWEELIJDTTQPAIVVKDWYSGRSDAGCIjYELTV 
1CLLSEHBNVLAEFS SGQVAVPQDSDGGG WME I SHTFTD YGPGVR 
FVRFEHGGQDSVYWKGWFGARVTNSSVWVEP 
KQWCAERRGIX3MSIJU)ELLADLEEAAEEBEGGSYGEEEEEPAJE " 






WO 01/53312 



PCT/US00/34263 





6084






DVQBETQLDLSGDSVKTIAK^W&SkMFAEIMMKlEHYISKQAXA"" 
SEVMGPVEAAPBYRVIVDANNLTVEIENELITIIHKPIRDKYSKR 
FPBLES LVPNALDYI RTVXELGN5LDKCKNNENLQQILTNATIM 

VVSVTASTTQGQQLSBSBLERLEEACDMALBLNASiCHRIYEYVE 
SRKSFIAPrTLSIIIGASTAAKIMGVAGGLTNLSKMPACNIMLLG 
AQRKTLSGPSSTSVLPHTGYIYHSDIVQSLPP1PPPPSVAP\DL 
RRKAARLVAAKCf LAARVDS FHESTEGKVGYELKDEIERKFDKW 
QEPPPVKQVKPLPAP LDGQR KKRGGRRYRKMKERLGLTE IR\ KQ 
ANRMSFGEXEEDAYQEDLGPSLGHU3KSGSGRVRQTQVNEATKA 
R I SKTLQRT LQ KQS WYGGKSTI RDRSSGTAS S VAFT PLQGL B I 
VNPQAAEKKVAEANQKYPSSMAEPLKVKGEKSGLMST 


6085 


1865 


309 


KQWCAERRtiLGMSLADELLADLEBAAEEBEGGSYGEBEEEPAIB " 
DVQEETQLDLSGDSVKTIAKLWDSKMFAEIMMKIEEYISKQAKA
S EVMG PVEAAPE YRVI VDANNLTVE I ENELNI THVPTPmrvQirD 

FPELESLVPNALDYIRTVKELGNSLDKCKNNBNLQQILTNATIM 
WSVTASTTQGQQLSEEEliERLEEACDMALBLNASKHRIYEYVE 
SRMS FI APNLS 1 1 IGASTAAKI MGVAGGLTNLSKMPACNIMLLG 
AQRKTLSGFSSTSVLPHTGY1YHSDIVQSLPPIPPPFSVAP\DL 
RRKAARIjVAAKCTLAARVDS FHESTEGKVGYELKDEIERKFDKW 
QE P PP VKQVK PL PAP LDGQRKKRGGRRYRKM KERLGLTE I R\KQ 
ANRMSFGEI EEDAYQEDLG FSLGHLGXSGSGRVRQTQVNEATKA 
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGLEI 
VNPQAAEKKVAEANQKYFSSMAEFLJCVKGBKSGLMST 




2 


1456


SGPRSFQGNRAVGRISLGGXRMPEVTLLPGVSSERVRRWRRARV" 

GVARVKPGNPWKPSPATQVPR/VPAQVYLPGRGPPLRBGEELVM 

DEBAYVLYKRAQTGAPCLSPDIVRDHLGDNRTELPLTLYLCAGT 

QAESAQSNRLMMLRMHNLHGTKPPPSEGSDEEEEEEDEEDEEBR 

KPQLELAMVPHYGGINRVRVSWLGEEPVAGVWSEKGQVEVPALR 

RLLQWSBPQALAAFLRDEQAQMKP I FSPAGHMGEGFALDWS PR 

VTGRLLTGDCQKNIHLWTPTDGGSWHVDQRPFVGHTRSVEDLQW 

SPTENTVFASCSADASI R I WDIRAAPSKACMLTTATAHDGDVNV 

IS WSRRE P FLLS GGDDGALKI WDLRQ FKSGS PVATFKQHVAPVT 

SVEWHPQDSGVFAASGADHQITQWDLG/IVERDPEAGDVBAD^G 

LADLPQQLLFVHQGETELKELHWHPQCPGLLVSTALSGFTIFRT 
ISV 


6086
6087


2419


1357 


GAATQHGGAI^IJUPCNPHGNGLLYAGFNQDHGCFACG^BKGFRV 
YNTDPLKEKEKQEFLEGGVGHVEMLFRCNyLALVGGGKKFKYPP 
NKVMIWDDLKKKTVrEIEFSTEVKAVKLRR\DKIWVLDSMIKV 

ptftknp\hqlhvpe\tcynpkglc^l<:pnsnnsllafpgthtg 

HVQLVDLASTEKP PVDI PAHEGVLSCl ALNLQGTRIATASEKGT 
L IRIFDTSSGHLI QELRRGSQAANIYCINFNQDASLI CVSSDHG 

tvhipaabdpkrnkqsslasasflpkyfsskwsfskfqvpsgsp 

C 1CAFGTEPNAVIAI CADGSYYKPLFNPXGECIRDVYAQPLEMT 


6088 


476 


1877 


UNSQRTGLPlTIFSRSFPIiTG^DLeENMPCTCTWRNWRQWIRP 
LVAVI YLVS I WAVPLCVWBLQKLE VG IHTKAWFI AG I FLLLT I 
PIS LWVI LQHLVHYTQ PELQKP 1 1 RI LWM VP I YSLDS WI ALKYP 
GIAIYVDTCRECYEAYVIYNFMGFLTNYLTNRYPNLVLILEAKD 
QQiGiFPPLCCCPPWAWGEVLLFRCECLGVLQYTVVRPFTTIVALI 
CELLG I YDEGNFS FSNAWTYL VI I N NMS QL FAMY CLLLPYKVLK 
EEI^PIQPVGKFLCVXLVVFVSFWQAWIALLVKVGVISEKHTW 
E WQTVEAVATGLQDFI I CI EMFLAAIA\ HHYTFS YKP YVQEAEE 
JOV,r u "* tr wl/ vox/ j. Kuuia t, y vRHVGR TVRGHPRKKLPPBDQ 
DQNEHTS LLSSSSQDAI S LAS SMPPSPMGHYQGFGHTVTPQTtP 
rTAKISDEILSDTIGEKKEPSDKSVDS 




1684 


689

1 1 


^ASGLVRLLQQGHRCLLAPVAP KLVP P VRU V KKGFR AAFR FQKE " 
jERQRLLRCP PPPVRRS EKPNWDYHAE I QAFGHRLQ ENFS LULL 
<TA?VNS CYIKS EEAKRQQLG IEKEAVLLNLKSNQELSEQGTSF 
5QTCLTQPLEDEYPDMPTEGIKNLVDFXTGEEWCHVARNLAVE 
JLTLS S B FP VP PAVLQQTPPAVI GALLQSSG PERTALF IRDFL I 
?QM'iX5K£LFEMWOlNPM3LiLVEELKKRNVSAPESRLTRQSG\A 








PTALPLYFVGLYCDKKLIAEGPGETVIjVAE EIEAARVALRKLygF 
TENRRPWNYSKPKETLRAEKSITAS 


6089 


3 


3054 


TRLGIPGSTISSRPIU^au^EOTFl^HSWrGSI^AhTGAPAW - 
PSRRLRDLPAGGMWRLRRAAVACEVCQSLVKHSSGIKGSLPLQK 
LHLVSRSIYHSHHPTLXLQRPQLRTSFQQFSSLTNLPLRKLKFS 
PIKYGyQPRRNFMPARIATRLlJCLRYLILGSAVGGGYTAXKTFD 
QWKDMI PDLS E YKWI VPDI VWEI DE YIDFE K I RKALPS S EDLVK 

LAPDFDKIVESLSLLKDFFTSGSPEETAFRATDRGSESDXHFRK 
VS DKEKIDQLQEELLHTQ LKYQR I LERLB KENKELR KLVIjQKDD 
KG 1 P F I ES LRKSliI DMYS EVLDVLS D YDASYNTQDHLPRVVWG 
DQSAGKTSVLEMIAQARIFPRGSGEMMTRSPVKVTLSEGPHHVA 
LFKDSSREFDLTKEEDIJXAIiRHEIELRMRKNVXEGCTVSPETIS 
LNVKGPGLQRMVLVDIJGVItn'VTSGMAPDTKETI FS ISKAYMQ 
DPNAIILCIQDGSVDAERSIVTDLVSQMDPHGRRTIFVLTKVDL 
AE KNVAS PS R I QQ 1 1 EG KL FPM KALGV FAWTGKGNS SES I EAI 

reyeeeffqnski^ktsmlka>:qvttrnlslavsdcfwkmvres 

vcyvAUi tr ka l KrNIjE - EWKNN Y PRtiRELDRNEL FEKAKNE ILD 

EVISLSQVTPKHWEEILQQSLKERVSTHVIENIYLPAAQTMNSG 

TFNTTVDIKLXQWTDKQLPNKAVEVAWETLQEEFSRFMTEPKGK 

EHDDIFDKLKEAVKEBSIKRHKWNDFAEDSLRVIQHNALEDRSI 

SDKQQWDAAIYFMEEALQARLKDTENAIENMVGPD\WKKRWLYW 

KNRTQEQCVHNETKNELEKMLKCNEEHPAYLASDEITTVRKNLE 

SRGVE VDPS h I KDTWHQVYRRHFL KTALNHCNliCRRG F Y YYQRH 

FVDSELECNDWLFWRIQRMLAITANTLRQQLTNTEVRRLEKNV 

KEVLEDFAEDGEKKIKIiLTGKRVQLAEDLKKVREIQEKLDAFIE 
ALHQEK 


6090 


194 


1560 


PVFVPAPGAVLEQAS/ASPPIATQTVVPLQHCKIPELPVQASIL 
c a, uyur r v*UJ-> ±au r VH Y I N X YKTVWWYP PSHP PSHTSLNFHIiID 
FNLLMVTTIVLGRRFIGS I VKEAS QRGKVSLFRS I LLFLTR FTV 
LTATGWSLCR5L1HLFRTYSFLNLI./FPLLSVWDVHSVPAAELR 
P\RKTSLFNHMASHGPREAVSGLAKSRDYLLTLR\RRGSSTQDS 
CMARTPCP / PHACCLS PSLI RSEVEFLKMDFNWRMKEVLVSSML 
SAYYVAFVPWFVKNTHYYDKRWSCELFLLVSISTSVILMQHLL 
PASYCDLLHKAAAHI^CWQKVDPALCSNVr^HPWTEECm T PQGV 
LVKKSKNVYKAVGHYNVAI PSDVSH FRFHFFFSKPLRILKIIiLL 
LEGAVI VYQLYS LMS SE KWHQTISLALI LFSNY YAFFKLLRDRL 
VLGKAYS YS AS pqrdldhrf s 


6091 


3279 


412 


SSRTREMEEKBILRRQIRLLQGLIDDYKTLHGNAPAPGTPAASG 
WQPPTYHSGRAFSARYPRPSRRGYSSHHGPSMRKKYSLVNRPPG 
PSDF PADHAVR PLHGARGG Q P P VPQQHVLERQVQLS QGQNWI K 
VXPPSKSGSASASGAQRGSLEEFEDTPWSDQRPREGEGEPPRGQ 
LQPSRPTRARGTCSVEDPLLVCQKEPGKPRMVKSVGSVGDSPRE 
PRRTVSESVIAVKASFPSSALPPRTGVALGRKLGSHSVASCAPQ 
LLGDRRVDAGHTDQPVPSGS VGG PARPASGPRQAREAS LWTCR 
TNKFRKNNYKWAASSKSPRVARRALSPRVAAEWCKASAGMA^ 
KVEKPQLIADPEPKPRKPATSSKPGSAPSKYKMKASSPSASSSS 
SFRWQS EAGSKDHAS QL3 PVLSRS PSGD \R PAVGHSGLKPLSGB 
TP LSAYKVKSRTKI I RR RGS TS LPGDKKSGTS PAATAKSHLSLR 

RRQALRGKSSPVIjKKTPNKGLVQVTTHRLCRLPPSRAHLPrKEA 
SSLHAVRTAPTSKVI KTRYR I V JCKTPAJ5P IVS Jk ddpdt cr netm* 

RRLSLSRSLVLNRLRPVASGGGKAQPGSPWWRSKGYRCIGGVLY 
KVS ANKL S KTS GQP S DAGS R PI*LRTG RLD P AGS CSRSLASRAVQ 

RSIJaiRQARQRREKRKSYCMYYNRFGRCNRGBRCPYIHDPEKV 
AVCTRFVRGTCfQCTDGTCPFSHHVS KEKMP VCS YFLKG I CSNSN 
CPYSHVYVSRKAEVCSDFLKCYCPLGAKCKKKHTLLCPDFARRG 
ACPRGAQCQLLHRTOKRHS RRAATS PAPGPSDATARS R VS AS HG 
PRKPSASQRPTRQTPSSAALTAAAVAAPPHCPGGSASPSSSKAS 
SSSSSSSS P PASI^HEAPSLQEAALAAACSNRLCKLPSFXSLQS 
SPS PGAQPRVRAPRAPLTKDSGKPLHI KPRL 


6092 


143 


3190


\KAPPTGESSEPEAKVIJJTKRLYRAVVEAVHR3jDLILCNKTAYQ 
rVFXPEWISliRNKLRELCVfCLMFLHPVDYGRKAEEI^lTRKVYYE 








VIQ L I KTNKKH IHSRS TfiECAYRTHLVAGlGF YQHLLL Y IQSH Y 
QLBLQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQMACHRCLVY 
LGDLSRYQNELAOVDTBLLABRFYYQALSVAPQIGMPFNQLGTL 
AGSKYYNVEAMYCYLRCIQSEVSFEGAYGmjCRLYDKAAKMYHQ 
LKKCETRKLSPGKKRCKDIKRLLVNPMYIiQSLLQPKSSSVDSEL 
TSLCQSVLEDFNLCLFYLPSSPNLSLASEDEEEYBSGYAFLPDL 
LIFQMVIICLMCVHSLERAGSKQYSAAIAFTLALFSHLVNHVNI 
RLQAELEBGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPPPVT 
PQVGEGRKSRKFSRLSCLRRRRHPPKVGDDSDLSEGFESDSSHD 
SARAS EG SDSGSDKSLEGGGTAF DAE TDSEMNSQESRSDLEDME 
EEEGTRSPTLEPPRGRSEAPDSLNGPLGPSEASIASNLQAMSTQ 
MFQTKRCFRIAPTFSNLIiLQPTTNPHTSASHRPCVNGDVDKPSE 
PASEEGSESEGSESSGRSCRNERSIQEKLQVLMAEGLLPAVKVF 
LDWLRTN PDL I IVCAQ S SQSLWNRLS VLLNLLPAAGELQESGLiA 
LCPEVQDLLEGCELPDLPSSLLLPEDMALRNLPPLRAAHRRFNF 
DTDRPLLSTLEE5WRICCIRSFGHFIARLQGSILQFNPEVGIP 
VS I AQS EQESLLQQAQAQFRMAQEEARRNRLMRDMAQLRLQLEV 
SQLEGSLQQPKAQSAMSPYLVPDTQALCHHLPVTRQLATSGRFI 
VII PRTVIDGLDLLKKBHPGARDGIRYLEAEFKKGNRYIRCOKB 
VGKSFERHKLKRQDADAWTLYKILDSCKQLT\LAQGAGEEDP5G 
M VT J I TGLPLDNPSLLSGPMQAALQAAAHASVDI KNVLDFYKQW 
KEIG 


6093 


76 


1002 


ACGRRAMLALRVART/SRWGAL\RGAVWAPGTRPSKRRACWALL 
P P VPCCLG CIiAERWRLRPAALGLRL P G IGQRNHCSG AG KAAPR \ 
PAAGAGAAAJEAPGGQWGPASTPSI»Y BNPWTI PNMLSMTR igi*ap 
VLG YL 1 1 EED FNI ALG VFALAGLTD L LDGFI ARNWANQ RS ALG S 
ALD PIiADK I L I S I LYVSI/TYADIjI P VPLTYM 1 I SRDVMIjI AAVF 
YVRYRTL PTPRTLAKY FNPC YATARL K PTFI S KVNTAVQLI L VA 
AS LAAP VFOTADS I YLQI LW CFTAFTTAASAYS YYH YGRKTVQ V 
IKD 


6094 


23 


1010 


PFLRCLRGDQKAKMSERKVLNKYYPPDFDPSKIPKLKLPKDRQY 
WRLMAPFNMRCKTCGEYIYKGKKFNARKETVQNEVYLGLPIFR
FYIKCTRCLAEITFKTDPENTDYTMBHGATRNFQAEKLLEEBBK
RVQKEREDEELNNPMKVLENRTKDS KLEMEVLENLQELKDLNQR 
QAHVDFEAMLRQHRLSBEERRRQQQEEDEQETAALLEEARKRRL 
tiEDSDSEDEAAPSPLQPALRPNPTAILDEAPKPKRKVEVWEQSV 
GSLGSRPPLSRLVWKKAKADPDCSNGQPQA/APHPRSPAEQEG 
GQPYTPDAWRVLPEPTGCIPGQ 


6095 


1 


1599 


TRGRAAERSRGRGHGFLGGGFA\SWDYFPSEDFYRCGYCKNES 
GSRSNGMWAHSMTVQDYQDL1DRGWRRSGKYVYKPVMNQTCCPQ 
YTIRCRPLQFQPSKSHKKVLKKMLKFIAKGEVPKGSCE\DEPMD 
STMDDAVAGDFAJLINKLD IQCDLKTLSDDI KESLESEGKNS KKB 
BPQELLQSQDFVGEKLGSGEPSHS 











VKVHTVPKPGKGADLSKPPCRKAKEIRKBRKRLkt^lQQNPAGEL 
EGPQAQGHPPSLFPPKAKSNQPKSLEDLIFESLPENASHKLEVR 
WRSSPPSSQPKATLLESyQVyKRYQMVIHKNPPDTPTESQFTR 
FLCSSPLEAETPPNGPDCGYGSFHQQYWLDGKIIAVGVIDILPN 
CVSSVYLYYDPDYSFLSLGVYSALREIAPTRQLHEKTSQI^YYY 
MGFYIHSCPKMKYKGQYRPSDLLCPETYVWVPIEQCLPSLENSK 
, YCRFNQDPEAVDEDR5TEPDRLQVFHKRA I M PYGVYKKQQKDPS 
E EAAVLQ YAS LVGQKCS ERMLLFRN 


6096 


2277 


575 


URVRAALLSSAMEDSEAl^FEHMGLDPRLL(iAVTDti3WSRPTIiI 
QEKAIPLALEGKDLLARARTGSGKTAAYATPMT,OT,T t tjd vatytj 

WEQAVRGLVLVPTKELARQAQSMIQQLATYCARDVRVANVSAA 
EDSVSQRAVLMEKPDVWGTPSRILSHLQQDSLKLRDSLELLW 
EE ADLLFSFG FEEBL KSLLCHLP R I YQAFLM SATFNEDVQALKE 
LILHNPVTLKIjQESQLPGPDQLQQFQWCETEEDKFLLLYALLK 
LSLIRGFCSLLFVNTJDERSYRLRLFLEQFSIPTCVLNGELPLRSR 
CHI ISQFNQGFYDCVIATDAEVLGAPVKGKRRGRGPKGDKASDP 
EAGVARGIDFHRVSAVLNFDLPPTPEAY1HRAGRTARANNPGIV 

ltfvlpteqfhlgkiebllsgenrgpillpyqfrmeeiegfryr 
crdamrsvtxqairearlkeikebllhsbklktyfednpr\dzjq 

LLRHDLPIiHPAWKPHLGHVPDYIiVPPALRGtiVRPHKK\GRSCL 

plvgrpreqsprthcaasstksrnsdpqpsppevvgplws 


6097 


1673 


192 


APGTMSGGKKKS S FOI TS VTT DVF fi vrz g arzn c n d n^nnn n mn n ^ 1 ■ 

PRLPNGEPSPDPGGKGTPRNGSPPPGAPSSRFRWKLPHGLGEP 
YRRGR WTCVD VYERDLEPH S FGGXL EG I RGAS GGAGGRSLD SRL 
ELAS LGLGAPT P PSGLSQG PTS WLRP P P TS PG PQARS FTGGLGQ 
LWPSKAKABKPPLSASSPQQRPPEPETGESAGTSRAATPLPSL 
R\^.EAGGSGARTPPI^RRKAVDMRLRMELGAPEEMGQVPPLDS 
RPSS PAliYFTHDASLVHKS PDPFGAVAAQKFSIAHSMLAI SGHJj 
DSDDDSGSGSLVG IDNKI EQAMDLVKSHLMFAVREEVEVLKEQI 
RELABRNAALEQBNGLLRALA\SPEQLGSAGPPRGVPR\ LGPPA 
PNGPFVLSLPSLT I VPI/5L PGLASAAWP PLPMPAL IVP VF PGVG 

VQALSNGPWSPGPLPHLLIIPSLDGGGEGFRTGRQQGAPFGEET 
QPPPSLPGTPQQ 


6098 
6099 


168 


1074 


NYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKElib 
EGKIFKNWGTQTEKEDTSNINPRQTETSVNASRSPEKCAQQRQK 
RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPKE 
NLS PGFSHLLS KMESS PIRFD I I*LDDLDTVP VSTLQRTN PR KQL 
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR 
TAWEKNKSVSYEQCKPVSVTPQGm>FBYTAKIRTLAETERFF\D 
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD 


6100


168 


1074 


«^UK^caPLEKI)SSPGSSSTSI,LlKKQRBTSDTPIMiiALKEU> 
EG K I FKNWGTQTE KEDTSNINPRQXETS VNASRS PEKCAQQRQK 
RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPZE 
NLSPGFSHLLSKNESSPIRFDILLDDLDTVPVSTLQRTNPRKQI. 
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR 
~^ r ^^- rk,D v *> * &\j^JbJr vi> vi ^ sjtjNDrE YTAKJ R TLAETERFF\ D 
ELTKEKDQ I EAALSRM P S PGGR ITIiQTRLNQEAFGRS FGKD 


6101


2 


713


FVEV3GYRSRADPEPRGRDTMTYAYLFKYIIIGDTGVGKSCLLL 
QFTDKRFQPVHDLTIGVEFGARMVNIDGKQIKLQIWDTAGQESF 
RSITRSYYRGAAGALLVYDITRRETFNHLTSWLEDARQHSSSNM
VIML I GNK5DLESRRDVKREEGEAFARE\HGLIFMBTSAKTACN 
VEEAFINTAKEI YRKIQQGLFDVHNEANGIKIGPQQS ISTSVGP 
SASQRNSRD IGSNSGCC 




1 


1399


FRGRAWPLREVSHWLGCRRVCSWSASWGRI.PALSARLSPLLAFR 
3KMVF P LSCAVQ QYAWGKMGSN3 E VARLLASSDPIAQIAED KP Y 
^LWMGTHPRGDAKILDNRISQKTLSQWIAEITODSIXSSKVKDTF 








NGNIoPFLFlO/LSVETPLSIQAHPNKEiiAEKLHLQAPQHYPDANH 
KP EMAI ALTP FQGLCG FRFVE E I VTFLKKVP E FQFL I GDEAATH 
ItXQTMSHDSQAVASSLQS CF S HLMKSBKKVWBQLNLLVKR ISQ 
QAAAGNNMEDIPGELLLQLHQQYPGDIGCFAIYFLNLLTIiKPGE 
AMFLEANVPHAYLKGDCVEC^4ACSDNTVRAGLTPKFIDVPTLCE 
MLSYTPSSSKDRLFLPTRSQBDPYLSIYDPPVPDFTTMKA\EVP 
G \ S VTEYKDLALDSAS I LLMVQGTVXASTPTTQTPI PLQRGGVL 
F IGANESVS LKLTE PKDLLI FRACCLL 


6102 


70 


2415 


QTPQATLAANGAEDSRGGEMLPAG^IGASPAAPCCSESGDERKN""' 
LEE KSD INVTVLIGS KQ V5 EG TDNG DLPS YVS AFI E KEVGNDLK 
SLKKLDKLIEQRTVSKMQLEEQVLTISSEIPKRIRSALKNAEES 
KQFLNQFLB QETHLFSA INSHLLTAQP WMDDLGTM I S Q IEE I ER 
HLAYLKW I S Q IEELSDNI QQYLMTNNVp JJAASTLVSMAELDI KL 
QESSCTHLLGFWRATVKFWHKILKDKLTSDFEEILAQLHWPFXA 
P PQS QTVG LSR PAS APE I YS YLETL FCQLLKLQTS HELLTEP K\ 
HS QKNTL FLP PLLS S /WP I QVMLT PLQKRFRYHFRGNRQTNVLS 
KPEWYLAQVLMWIGNHTEFLDEKIQPILDKVGSLVNARLEFSRG 
LMMLVLEKLATDIPCLLYDDNLFCHLVDBVLLFERELHSVHGYP 
GTFASCMHILSEBTCFQRWLTVERKFALQXMDSMLSSEAAWVSQ 
YKDITDVDEWKVPDCAETFMTLLLVITDRYKNLPTASRKLQFLE 
LQKDLVDDFRIRLTQVMKEETRA51^FRYCAILNAVNY7SrVLA 
DWADNVFFLQLQQAALEVFAENNTLSKLQLGQLASMES SVFDDM 
INLLERLKHDMLTRQVDHVFREVKDAAKLYKKERWLSLPSQSEQ 
AVMSLSSSACPLLLTLRDHLLQLEQQLCFSLEKIFWQMLVEKLD 
VYIYQBI ILANHFNEGGAAQLQ FDMTRNLFPLFSHYCKRPENYF 
KHIKEACI VLNLNVGSALTAGKDVLPVQLQGS FPAT 


6103 


207 


2523 


ESNSTMTTYLEFIQQNEERDGVRF^SW^PSSRiEATRMVVPVA 
ALFTPLKERPDLPPIQYEPVLCSRTTCRAVLNPLCQVDYRAKLW 
ACNFCYQRNQ F PPS YAG I S ELNQPAE LLPQFS S IEY WLRGPQM 
PLI FIi YWDTCMEDBDLQALKESMQMS LSLL PPTALVGLI TFGR 
MVQVHELGCEGISKSYVFRGTKDLSAKQLQBMLGLSKVPVTQAT 
RGPQVQQPPPSNRFLQPVQKIDMNLTDLLGELQRDPWPVPQGKR 
±-uR5iuvALi ±Av\*imtijh\*l r PNU. GAR I MMF I GG P ATQG PG M WG 
DELKTPIRSWHDIDKDNAKYVKKGTKJ^FEALANRAATTGHVIDI 
YACALDQTGLLEMKCCPNLTGGYMVMGDSFNTSLFKQTFQRVFT 
KDMHGQFKMGFGGTIiE I KTPR\B I KISGAIGPCVSLNSKGP CVS 
ENEIGTGGTCOWKICGLSPTTTLAIYFEWWQHNAPIPQGG\RG 
A\ IQFVTQY\QHSSGQRRIRVTTIAPn\ Wflnar»TnTn-KTT a a Qi?n 

OEAAAILMARLAI YRAETEEGPDVLRliTLDRQLI RLCQKFGE YHK 

ddpss fr fsetfs lypqfm fhlrrss flq vfnns pdes s yyrhh 
fmrqdltqsi>imiqpilyaysfsgppepvlldsssiladrii,lm 

DTFFQ IL I YHGET I AQWR KSG YQDMPE YENFRHLLQAP VDDAQE 
ILHSRFPMPRYIDTEHGGSQARFLLSKVNPSQTHNNMYAWGQES 
GAP I L TDDVS L QVFMDH LKKLAVSS AA 


6104 


124 


732 


KVSEYI ILSKDKI LFHALAM I.VI ,WS PNS AARG VLRN YWERLLR 

iOiPQSRPGFPSPPWGPALAVQ\AQPCLQSQQMIPVEVKRI/RSL 
LDSI FWMAAPKNRRTIEVNRCRRRNPQKLI KVKNNI DVCPECGH 
LKQKHVLCAYCYE KVCKETAE IRRQ IGKQEGQPFKAPTIETWL 
YTGETPSEQDQGKRI IERDRKRPSWFTQN 


6105 


3 


983 


PLHGACTSIiVLQRFCHRRPRPCAPARPEDMRRPAAVPLLLLLCF - 

gsqrakaatacgrprmlnrmvggqdtqegewpwqvs IQRNGSHF 

CGGSLIAEQWVLTAAHCFRNTSETSLYQVLLGARQLVQPGPHAM 
YARVRQVESNPIiYQGTASSADVALVELEAPVPFTNYILPVCLPD 
PSVIFETGMNCWVTGWGSPSEEDLLPEPRILQKLAVPIIDT\PR 
CNLLYSKDTEFGYQPKTIKNDMLCAGFEEGKKDACKGDSAGPLV 
CLVGQSWLQAGVISMGEGCARQNRPGVYIRVTAHHNWIHRIIPK 
LGVQPSEVGRPEVTPPGPGAP 


6106 


3 


1302 


GRPPTAPHTGRPPTANRGDPRLDLKRGCARLLTSIESRGRPAAS
AGLRRDRCALRRWPLRRAPLARATRRRAGSPRRCAPRPRACPQG 
! MSRARHQPGGLCLLLLLLTOFMEDRSTVQAGNCW^QAKNGRCQV 
LYKTELSKEECCSTGRLSTSWTEEDVNDNTLFKWMIFNGGAPNC 
IPCKETCBNVDCGPGKKCRMNKKNKPRCVCAPDCSNITWKGPVC 
GLDGKTYRNECALLKARCKEQPELEVQYQGRCKKTCRDVFCPGS 
STCV\VDQTNNAYCVTCNR1CPEPASSEQYLCGNDGVTYS\SAC 
HLRKATCLtiGRS I GLAYEG KCI KAKS CEDI QCTGG KKCLWDPKV 
GRGRCSLCDELCPDSKSDEPVCASDNATYASECAMKEAACSSGV 
LLE VKHSGSCNS I S EDTEEBEEDEDQDYS F PI S S ILEW 


6107 


623 


168 


SRCSS PRPEPGRGRGK/ LS PSEHRKWVEVPKACDEDHKGYLSftE" " 
DFZTAVVMLFGYKPSKIEVDSVMSSINPN1SGILLEGFLNIVRK 
KKEAQR YRNEVRH I FTAFDTYYRG FLTLBDFKKAFRQVAPKLPE 
RTVLEVFREV\DRDS\DGHVSF 


6108 


3 


1348 


GGSLRFSPPRVPSGSRVFCPVPPGGCGLPSPMSASRPQSPTTPW" - ! 
CLPRRYMKHKRDDGPEKQEDEAVDVTPVMTCVFVVMCCSMLVLL 
YYFYDIiLVY VVIGI FCLASATGL YSCLAPCVRRLP\ SASAGBS A i 
L1APTIPWSLPYFHKRPQARMLKLALFCVAVSVVWGVFRNEDQ 
WAWVLQDALGIAFCLYMLKTI RLPTFRACTLLLLVLFLYDI FFV 
FI TP FLTKSGS S I MVEVATGPSDSATREKLPM VLKVPRLNS S PL 
ALCDRPFSLLG FGDILVPGLLVAY CHRFDIQVQSSRVYFVACTI 
AYGVGLLVTFVALALMQRGQPALLYLVPCTLVTSCAVALWRREL 
GVFWTGSGFAECVLPPSPWAPAPADGPQPPKDSATPLSPQPPSEE 
PATSPWPABQS PKSRTSEEMGAGAPMREPGS PAES EGRDQAQPS 
PVTQPGASA 


6109 


1 


1381


CKSHAGAASGG AI LEGTKLRJIQRVDTNKPLDPLVPS ALRAAMLY 
LEDYLEMIEQLPMDLRDRFTEMREMDLQVQNAMDQLEQRVSEFF 
MNAKKNKPEWREEQMASIKKDYYKALEDADEKVQLANQI YDLVD 
RHLR KL DQELAKF KMELE ADNAGI TE ILERRSLELDTPSQPVNN 
HHAHS HTP VEKRKYN PTSHHTTTDH I PEKKFKS E AL LSTLTS DA 
S KENTLGCRNNWS TASSNNA YNVNS SQPLGS YNIGSLSSGTGAG 
GX\TMAAAQAVQATAQMKBGRRTSSLKASYFJVFKNMDFQLGKEF 
SMARETVGYSSSSALMTTLTQNASSSAADSRSGRKSKNNNKSSS 
QQSSSSSSSSSLSSGSSSSTWOBISQQTTWPESDSNSQVDWT 
YDPNB PR YC ICNQVS YGEM VGCDTQDCP I EWFKYGCVGLTEAPK 
GKWYCPQCT\AAMKRRGSRHK 


6110 


77 


2464 


ACP5AATMSDQDHSMDEMTAWKIEKGVGGNNGGNGNG3GAFSQ 
ARSSSTGSSSSTGGGGQBSQPSPLAIJiAATCSR IES PNENSNNS 
QGPSQSGGTGELDLTATQLS QGANGWQI ISSSSGATPTSKEQSG 
SSTNGSNGSESSKNRrVSGGQYWAAAPNLQNCX2VLTGLPGVMP 
NIQYQVI PQFQTVDGQQLQFAATGAQVQQDGSGQIQI IPGANQQ 
I ITNRGSGGNT IAAMPNLIiQQAVPLQGLANNVLSGQTQYVTNVP 
VAIiNGNITLLPVNSVSAATLTPSSQAVTISSSGSQESGSQPVTS 

GTTISSASLV^SOm ^SQPFTMllWCVCTTTTTCMM/' rMxiammn^ 

v * **«*/wuvo<9y*wooorr aiN>\iM£>i o 1 ill IoCipHjIPINFTTSG 

s 5gtn5 qgqtpqrvsglqgsdalni qqnqts ggs lqagqqkegb 
qVnmtg^pksi.srpqlvo^gXqalqxafqaaplsgqtfttqa 
isqetlqnlqlqavpnsgpi iirtptvgpngqvswqtlqlqnlq 
vqnpqaqti tlapmqgvs lg qts ssnttltp i asaas i pagt vt 

VNAAQI^SMPGI^TINLSALGTSGIC^HPIQGLPLAIANAPGDH 
G AQLGLHGAGGDG IHDDTAGGEEGBNS PDAQPQAGRRTRREACT 
CP YCKDSEGRGSGDPGKKKQHI CH I QGCG KVYGKTS HLRAHLR W 

htoerpfmctwsycgkrftrsdelqrhkrthtgekkfacpecpk 

R FMRS DH L S KHI KTHQN KKGG PG VALS VGTLP L OS GAGS EGSGT 

ATPSAXITTNMVAMEAICPEGIARLANSGINVKEGGQFCSPINT 
SANGF 


6112


16-37 
1 77 


797 
196 


KVDPR VRQAMAP WG KRLAGVRGVLLD I SGV JU'YDSGAGGGTAIAG" 
SVEAVARLKRSRIjKVRFCTNESQKSRAELVGQLQRLGFDI seqe 
VTAPAPAACQI LKERGLRPYLIi I HDGV\AS E FDQIDTS /STPNC 
VV I ADAGE S FS YQNKNNAFQVLMELBKPVLI SLG KGRYYKETSG 
LMLDVGPYMKALE YACG I KAEVGG KPS PE PFXS ALQA I G VEAHQ 
AVM IGDD I VGD VGGAQRCGMRALQ VRTGKFR PSDEHHPE VKADG 
YVDNLAEAVDLLLQHADK 


6113 


1779 


567 


msshksfkskrflakkqkpnrpilqwiwlktgnkxrMwi? 

WEGRSWAATOVNZiQGAWGERSGVRASEAEdPGltRAbVSWWSRQL 
ETMVDHLANTE INSQRIAAVESCFGASGQPLALPGRVLLGEGVL 
TKECRKKAKPR IFFLFNDILVYGS IVLNKRKYRSQHIIPLEEVT 
LELLPETLQAKNRWMIKTAKKSFVVSAASATERQEWISHIEECV 
RRQLRATGRP A \ STEHAAP W I PDKATD I CMRCTQTRFSALTR RH 
HCRKCRWVCAECSRQRFLLPRLSPKPVRVCSLCYRELAAOQRK 
EEAB EQGAGVPRAASHIARP ICGR PVEMTMTPTRTRRAAG^ATG 
PAAWSSTPRGWPGLPSTADPRPAEHliS PSQLHCPG PQEGS SRSC 
PGLRD PIP WKQ VQRWGVALSGt»PVP FCWTLCP YGFTAGNAFPF^ 
KPQNTHRSW 


6114 
6115


818 


246 


PTSRPRPSPGSPAMSWSACVSAAPSSSWPASSSWPCGPRRCCTR 
RRRCSPRCGLAAGSMCSCSPSWRCTPVPACWPSPPP\PABQVQC 
GHLPPHADRRALRLPVAAPARG PGPGHPAGPAGPRPARTP PAS ° 
HGPGRPT VpAP P CPLLAATE PT PSR PHQRWTRB DRMLGRGSQVT 
GRPQWFTiRGLVLFSL 


6116


324 


71 


V VU5RVCAHPHLYTHIHMHICAHAC \IHTHAQLC/ ITASHALAH 
SHLYTCMVMLTASHTPSHTHPHTAVHKEHRADVLRGTLTPLR 


6117


595 


1430 


TtiVMPPGRWRAA; I SSSGPVFEGARA\LQTVKKSEEDE3YTPVQ 
AARPQTLNRPGQEL FRQL FRQLRYHES S GPLETLSRLRELCR WW 
LRPDVLSKAQILELLVLEQFLSILPGELRW^QLHNPBSGEE\L 
W PCWRS CRGTLMGHPGGTRALP\ E PRCALDGYRS \LRS AQI WS L 
ASPLRSSSALGDHLEPPYEIEARDFIiAGQSDTPAAQMPALFPRE 
GCPGDQVTPTRSLTAQLQETMTFKDVBVTFSQDEWGWLDSAQRN 
LYRDVMLENYRNMASLGK 


6118 


1433 


222 


vi^vjjsfappcswbvgTgggwt^ 

GLLSLFP PAAMHPAAFPIjPVWAAVLWGAAPTRG LI RATSDHNA 
SMDFADLPAliFXSATLS QEGLQGFLVBAHPDNACS P IAP? PPAP V 
NGSVFIALLRRFDCNFDLKVnCiNAQKAGYGAAVVHKnmSNEIjLNM 
VWNSEBIQOQIWIPSVFIGERSSEYLRALFVYEKGARVLLVPDN 
TFPLGYYLI PBTGI VGLLVLAMGAVMIARCIQHRKRLQRNRLTK 
\EQLKQI \ PTHDYQKGDQYDVCAI CLDE YEDGDKLRVLPCAHAY 
HSRCVD PWLTQTR KTCPI CXQP VHRGPGDEDQEEETQG QEEGDB 

GEPRDHPASERTPLLGSSPTLPTSFGSLAPAPLVFPGPSTDPPL 
SPPSSPVILV


6119


1044 


247 


STI 3 CRACTSGAX PGAQSHRS ARGHAAGGKETAALGMERGFCVKK~ 
KEXBKETQKE3CIGEKGREEKVKRKEVEQKIKQEKQEKQERRKGK 
EKEEKRTKQGKETNKEKEQFKGQEEKGENKDSTLTRTPLEPLEK 
NKQ I LVLGLDG AGKTS VLHSLASNRVQHS VAPTQG FHAVCI>JTE 
DS QMEFLB IGGS KPFRS YWEM YLSN/ADS LARSFS VGFKQDSQP 
ITWKAKKY LHQ L I AANPVLPLWFANKQDLEAAYHI TDIHEALA 




1217 


462 



DPRFVTENTTKAPAOBRTTQPRSSREGTLRSTMEYLSALNPSDL 
LRS VSN IS SEFGRR VWTSAP P PQRPFRVCDHKRTIRKGLTAATR 
QBLliAKALETLIil^GVLTIiVI^EDGTAVDSEDFFQIiEDDTCLM 
VLQSGQSWSP7RSGVLSYGLGRERPKHSKDIARFTFDVYKQNPR 
DLFGSLNVKATFYGLYSMSCDFOGli\GPKKVLRELLRWTSTLLQ 
3LGHMLLGI SSTLRHAVBGAEQWQQKGRLHS Y 


6120 


765


179 


LERAGGOGIiSSRALVGSGACLS LVARANGKGLPRGRKEFVEAVR 
VRYVAFR YRTPRAVCLRLWS CRRB V I MS GRGKQGGKVRAKAKSR 
SS RAGLQFPVGRVHRLLRKGNYAERVGAGAPVYLAAVLEYLTAB 
I LELAGNAARDNKKTRI I PRHLQLAIRNDEELNKLLGKVTIAQG 
G\VLPNIQAVLI*PKKTBSQKDBGANDP 


6121 


1612 


107 


FVRAOARGS RQP VRRPLLGAGS RLRCRS CGRMPpt.vv i<"(fVflTnw 
RGNGLRAVTPI,RPGBLLFR5DPLAYTVCiCGSRGWCDRCLI/3KB 
KLMRCSQCRVAKYCSAKCQKKAWPDHKRBCKCLKSCKPRYPPDS 
VR LLGR WFKLMDGAPS ESEKL YSF YDLESNINKLTED XKEGLR 
QLVMTFQHFMREEIQDASQLPPAFDLFEAFAKVI CNS FTICNAE 
MQEVGVGLYPS ISLLNHSC3DPNCSI VFNGPHLLLRAVRDIEVGE 
ELTICYLDMIiMTSEERRKQLRDQYCFECD\CFRCQTQDKDADML 
TGDEQ VWKE VQESLKKI EELKAHWKWEQVLAMCQAI I S SNSERL 
PDINIYQLKV^DCAMDACINLGLLEEALFYGTRTMEPYRIFFPG 
SHPVRGVQVMKVGKLQLHQGMFPQAMKNLRLAFDIMRVTHGREH 
SLI EDLILLLE/ AMR RQ HQS I LRERSQ RBI RRVSLLNALLRSHT 
LCFVSCVNLSYWKFCSVFV 


6122 


2 


2324 


RFR KMADGGAASQDESSAAAAAAADS RMNN P S ETS KPSMESGDG " 
NTG TQTNGLD FQKQ P VP VGGAI STAQAQAFLGHLHQ VQLAGTSI* 
QAAAQSLNVQSKSNEESGDSQQPSQPSQQPSVQAAIPQTQLMLA 
GGQ I TGIVTLT PAQQQLLLQQAQAQAQLLAAAVQQHSASQQHS AA 
GAT I SASAAT PMTQ I PLS Q P I Q I AQ DLQQLQ QLQQQNLNLQQFV 
LVHPTTNLQPA\QFIISQTPQGQQGLLQA\QNLLTQLPRQSQAN 

"uwowiftl \ A.U.L o^c'.MJl Jr A W»X J^ini Ir uir^o Uw X. lr.IvK._L DTPS 

LEBP\SDLEELEQFAKTFKQRRIKLGFT\QGDAGLAMVKLYGND 
FSPTTIFRFEAimSFKNMCIOKPLLEKWriNDAENLSSDSSLSS 
PS ALNSP G I EGIiS RRRKKRTSIEA\N IRVALEKS FLEK\ QKPTS 
EEITMIADQLNMEKGV3RVWFCNRRQKEKRINPPSSGG\TSSSP 
IKAIFPSPTSIiVATTPSLVTSSAATTLTVSPVLPIiTSAAVTNLS 
VTGTS DTTSKNTATVI S TA P PASS AVTS PSLS PS ? SASAS TSEA 
SSASETSTTQTTSTPLSSPLGTSQVMVTASGIiQTA/AQLLPFKG 
AAQLPANASIJUmAAAAGLNPSU^SQFAAGGALLSLNPGTLS 
GALSPALMSNSTLATIQALASGGSLPITSLDATGNLVFANAGGA 
PNIVTAPLFLN PQNLS LLTSN P VSLVS AAAAS AGNS APVASLHA 
TSTSAES IQNS LFTVASASGAASTTTTASKAQ 


6123 


3 


2944 


HLLHRWFGTDMQMINFTTGEFQLTEACPYLGTHSEESRFGILHL 
HLQPLEMKRVGVVFTPADYGKVTSI.ILIRNNLTVIDMIGVEGFG 
ARELLKVGGRLPGAGGSLRFKVPESTLMDCRRQLKDSKQILS IT 
KMFKVENIGPLPITVSSLKINGYNCQGYGFEVJjDCHQFSLDPNT 

srdis ivftpdftss wvirdlslvtaadlefrftlnvtlphhll 
plcadwpgpsweesfwrltvffvslsllgviliafqqaqyilm 
efmktrqrqnass s s qqnngpmd vis phs yksncknfldt ygps 
dkgrgrajclpvktpqsriqnaakrspatyghsqkkhkcsvyysk 
hktstaaasststtteekqts plgsslpaakedictdamrenwi 
s lr yasg i nvnlqknltlp kn llnkeentlknt1 vfsnpssbcs 
mkegiqtcmfpketdiktsentaepkerelcplktskklpenhl 
prnspqyhqpdlpeisrknngnnc^vpvknevdhcenlkkvdtk 
pss ekk1hktsredmfsekqdi pfveqedpyrkkklqekregnl 
qnlnws ks rtcrknkkrg vap vsrppeqsdliklvcsdferseljs 
s dinvrswciqbstrevckadae iasslpaaqreaegyyqkpek 
kcvdkfcsdsssdcgsssgsvrasrgswgswsstsssdgdkkpm 
vdaqhfl pagds vsqndfpseap islnlshni cnpmtgnslpq y 
ae pscpsl pag ptgveedkgl ys pgdlwptpp vcvts slnctle 
ngvpcviqesapvhnsfidwsatcegqfssaycplblndynafp 
e enmn yangfpcpadvqtdfidhns qstwntp p\nmpas \ wgna 
qfpsssrpylkstpkaclpmsglfgpi\wap\qsdvyenccpin 
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Karri nr» -i «tj_r 
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amino acid 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6124






PTTEHSD/ THMENQA\ WCKEYYPCF \NPFRAYMNLDIWTTT\A~ 
1 NRNANFPLSRDSSYCGNV 




1573 


236 


SDEAliRLAGERGMGRVQLFEISLSHGRWYSPGEPLAGTVRVRL 
GAPLPFRAIRVTCIGSCGVSNKANDTAWWEEGYFNSSLSLADK 
GSLPAGEHSFPFQFLLPATAPTSFEGPFGKIVHQVRAAIHTPRF 
S KDHKCSLVFYI LS PLNLNS I PDIEQPNVASATKKFS YKLVKTG 
S WLTAS TDLRG Y WGQALQLHADVENQSGKDTSP WASLIrQKV 
S YKAKR W IHDVRT IAE VEGAG VKAWRRAQWHEQ ILVPALPQSAL 
PGCS L I H I DYYLQ VS LKA PEATVTliPVF IGNI AV /NPCPSE P PA 
RPGAASWGPTPGG\PSAPPQEEAEAEAAAGGPHFLDPVFLSTKS 
HSQRQPLLATLSSVPGAPEPCPQDGSPASHPI*HPPLCISTGATV 
PYFAEGSGGPVPTTSTLILPPEYSSWGYPYEAPPSYEQSCGGVE 


6125 


1 


904 


KTCP KLTCAFTVS VP DSCCRVCRGDGELSWEHSDGDI FRQPANR 
EARHSYHRSHYDPPPSRC2AGGLSRFPGARSHRGALMDSQQASGT 
IVQIVINNKHKHGQVCVSNGKTYSHGBSWHPNLRAFGIVECVLC 
TCNVTKQBCKKIHCPNRYPCKYPQKIDGKCCKVCPG/KKAKEEL 
PGQSFDNKGYFCGEETMPVYESVFMEDGETTRKIALETERPPQV 
EVHVWTIRKGILQHFHIEKlSKRMFEELPHFKLVTRTTLSQWia 
FTEGEAQI SQMCSS RVCRTELEDLVKVLYLERSBKGHC 


6126 
6127 


1224 


389 


RLLSEAPCPRSRRRFQMNPEWGQAFVHVAVAGGLCAVAVFTG1F 
DSVSVQVGYEHYAEAPVAGLPAFUVMPFNSJbVNMAYTLLGLSWL 
HRGGAMGIjGPR YLKD VFAAMALL YG PVQ WLRLWTQWRRAAVLDQ 
WLTLPI FAWP VAWCLYLDRGWRP \ WLFLSLECVSLAS YGUVLLH 
PQGFEVALGAHWPAVGQALRT\HRHYG/ SATPSATYIjALGVLS 

CLGFWLKLCDHQLARWRLFQCLTGHFWSKVCDVLQFHFAFLFL 
THFNTHPRFHPSGGKTR 




1335 


463 


VLPRRCLVF WNTMDSSREPTIjGRIiDAAG FVTQVWQR FDADEIUj'y 1 ' 

IEEKEIJDAFFLHMLMKLGTDDTVMKANLHKVKQQFMTTQDAS KD 

GRIRMKEIAGMFLSEDENFLLLFRRENPLDSSVEFMQIMRKYI3A 

DS S G F IS AAKLRNFLRDLFLHHKKAI S EAKLEE YTGTMMKI FDR 

NKDGRLDLNDLARILALQENFLLQFKMDACSTEKRKGDFEKIFA 

YYDVSKTGALEGP\EVDGFVKDMMELVQPSISGVDLDKFREILL 

RHCDVTTKDGKIQKSBLALCLGLKINP 


6128 


2511 


843


TGRMSRRQLERWVWSSQQVQARGRNVRAPRLGKIAMGLEMSSKD 
SPGSLDGRAWEDAQKPQSAWCGGRKTRVYATSSRRAPPSEGTRR 
GGAARPBKTAEEGPPAA PGSLRHSGPLG PHACPTALPEPQVTSA 
MSSQWGIEPLYIKAEPASPDSPKGSSETETEPPYA1jAPG\PAP 
TRCLPGHKEE E DGEGAG PGEQGGGKLVLSSLPKRLCL VCGDVAS 
GYHYGVASCEACKAFFKRTIQGSIEYSCPASNECEITKRRRKAC 
QACRFTKCLRVGMLKEGVRLDRVRGGRQKYKRRPEVDPLPFPGP 
FPAGPIAVAGG PRKTAAPVNALVSHLLVVEPEKLYAM PDPAG PD 
GHLPAVATLCDLFDREIWTISWAKSIPGFSSLSLSDQMSVLQS 
VWMBVL VLG VAQRSLTLQDE IiAFAE YL VLDEEGAR P AGLG ELG \ 

AALLQLVRR^ALRLEREEYVLIJCALALANSDSVHIEDEPRLWS 
SCEKLIiHEALLEYEAGRAGPGGGAERRRAf3HTJ TTT dt t n/v™^ 

KVLAHFYGVKLEX3KVPMHKLFLEMLEAMMD 


6129 
6130 


1764 
3 


771 


577


AR^ARSAHHGJWKKKTGARKKAENRREREKQLRASRSTiDLAK 
KPCWASMECDKCQRRQKNRAFCYFCNSVOKLPICAQCGKTKCMM 

KSSDCVI khagvystglamvgai cdfceawvchgrkclsthaca 
cpltdaec\vecergvwdhggrifscsfchnflceddqfehqas 

CQ VIjEAETFKCVS CNRLGQHS CLRCKACFCDDHTRS KVFKQEKG 
KQPPCPKCGHETQETKDLSMSTRSLKFGRQTGGEEGDGASGYDA 
ifWKNLSSDKYGDTSYHDEEEDEYEAEDDEEEEDEGRKDSDTESS 

dlftnlnlgrtyasgyahyeeqen 
jrggtmreykvvvlgsg\gvgksaltv\qfvtctfiekydptie 
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ID 
NO: 
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amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
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amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DFxRKEIEV\DSSPSVAGISWTQQGTHQ?VASMRDtYIKiCGQGC~ 
I LVYSLVNQQS FQ \DI KF MRDQ I IRVKVSEKVPVI \ LVGN\S VD 

LESEREVSSSEGRAIABEWGCPFMETSAKSKTMVDELFAE1VRQ 
MNYAAOPDKDDPCCSACNIQ 


6131 


3 


1811 


SSPREKTSDSSHRPSRHGPLFLRLVGLSPFSYLCVPPSRPVPGS 
PRSLSAMRLLPLAPGRLRRGS PRHLPSCS PALLLLVLGGCLGVF 
GVAAGTRRPNWIiLLTDDQDEVLGGMTPLKKTKALIGEMGMTFS 
SAYVPSALCCP5RA5ILTGKYPHNHHWNNTLEGNCSSKSWQKI 
QE PNTFPAI LRSMCG YQTFF \AGKYLNB YG APDAGGLEHVPLG W 
SYWYALEKNSKYY1^TI>SIKGKARKHGENYSWYLTDVLANVSL 
DFLD YKS NFE PFFMMTATP \APHS PWTAAPQYQ KAFQNVFAPRN 
KNFNIHGTNICHWLIRQAKTPMTWS S IQFLDNAFRKRWQTLLSVD 
DLVEKLVKRLEFTGELNNTYI FYTSDNGYHTGQFSbPI DKRQLY 
EFDIKVPLLVRGPGIKPNQTSKMIjVANIDLGPTILDIAGYDLNK 
TQMDGMSLLPILRGASNLTWRS0VLVEYQGEGRNVTDPTCPSLS 
PGVSQCFPDCVCEDAYNNTYACVRTMSALWNLQYCEFDDQEVFV 
BVYI7LTADPDQITNIAKTIDPBLLGKMNYRLMNILQSCSGPTCRT 
P GVFD PG YRFD PRLM FS NRGS VR TRRFS KHLL 


6132 


96 


1241 


aagllppglvpedprrtrniXpfgiqgppfalsrplfscvesgw- 

AWEAMBPEFLYDLLQLPKGVBPPAEEELSKGGKKKYLPPTSRKD 
PKFEELQKPA\VLMEWINATLLPEHIWRSLEEDMFDGLILHHL 
FQR LAALKLEAEDIALTATSQKHKLTVVIiEAVNRS \ CSWRSGRP 
SGA/WESIFNKDLLSTLHLLVALAKRFQPDLSLPTNVQVEVITI 
ESTKSGLKSEKLVEQLTEYSTDKDBPPKDVFDELFKLAPBKVNA 
VKBAI VNFVNQ KLDR LGLS VQNLDTQFADG V ILLLLIGQLEGFF 

lhlkefylxpnspaemlhnvtlalell/ igrgpaqlpc /lalk/ 

TIVKKDAKSTLRVLYGLFCKHTQKAHRDRTPHGAPN 


6133 


2 


4256 



fvkgsmadtdlfmeceeeelepwqkisdviedswedynsvdkt 

TTVS VSQQ P VSAP VP IAAHAS VAGHLS TSTTVS S SGAQNSDSTK 
KTLVTLIANNN^NPIiVQQGGQPLILTQNPAPGLGTMVTQPVLR 

pvqvmqnanhvtsspvasqpifittqgfpvrnvrpvqnamnqvg 

IVLNVQQGQTVRPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTS LGQLAVQS PGQSNQTTNPKLAPS FPS P PAVS IAS FVT 

vkrpgvtgensnevaklvntlntipslgqspgpvwsnnssahV 
gsqrtsgpessmkvtssipvfdlqdggrkicprcnaqfrvteai* 
RGHmcyccpemveyqkkgksldsepsvpsaakpps pektapvas 

/THPS S TPI PALS P P Y/TKVP E PNENVGDAVQTKL I ML VDDF YY 

grdggkvaqltnfpkvatsfrcphctkrlknnirfknhmkhhve 
ldqqngevdghticqhcyrqfstpfqlqchlenvhspyesttkc 
kicewafeseplflqhmkdthkpgempyvcqvcqyrsslysevd 
vhfrmihedtrhllcpyclkvfkngnafqqhymrhqkr\nvyh\ 
cntkcrvq flfakdki ehklq hhktfrkp kqleglkpgtkvt I RA 
srgqprtvpvssndtppsalqeaapltssmdplpvflyppvqrs 
ioxravrkmsvmgrqtclecsfeipdfpnhfptyvhcslcryst 

CCSRAYANHMI NNHVPRKSPKYLALFKNSV<?n T ifT.arTcr^nnp 

svgdamakhlvfnpshrsssilprgltwiahsrhgqtrdrvhdr 

NVKNMYP P PSFPTN JCAATVKS AG ATPAE PEE LLTPLAP ALPS PA 

statppptpthpqalalpplategaeclnvddqdegs pvtqbpe 

LAS GGGGSGG VGKKEQliSVKKLRVVLFALCCKTEOAAJEJIFRWPQ 

rrirrwlrrfqasqgenlegkylsfbaeeklaewvltqreqqlp 

VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAIITLPKD VAENAGL F IDFVQRQ I HNQDLPLSMI VA IDE I S LFL 
DTEVLSSDDRKENALQTVGTGEPWCDVVLAILADGTVLPTLVFY 
RGQMDQPANMPDS I LLEAKBSGYSDDEIMELWSTRVWQKHTACQ 
MKGMLVMDCHRTHLSEEVLAMLSASSTLPAVVPAGCSSKrQPL 






WO 01/53312 



PCT/US00/34263



SEQ 
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nucleotide 
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corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
amino acid
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DVC IKRTVKNFLHKKWKEQAREMADTACDSDVLLQLVtjWLGE V 
LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS 
LEEQLKLSGEHSBSSTPRPRSSPEETIEPESLHQLPEGESETES 
FYGFEEADLDLMEI 


6134 


2 


425£ 


PVHG^MADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT 
TTVSVSQQPVSAPVPIAAriASVAGHLSTSTTVSSSGAQNSDSTK 
KTLVTLI ANNNAGNPLVQQGGQPLI LTQNPAPGLGTMVTQPVLR 
P VQVMQNANHVTSSPVASQPI FITTQGFPVRNVRPVQNAMNQVG 
I VLNVQQGQTVKP ITLVPAPGTQFVKPTVG VPQ WSQMTP VRPG 
STMPVRPTTNTPTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQ P TSLGQLAVQS PGQSNQTTNPKLAPS F PS PPAVS IAS PVT 
VKRPGVTGENSNEVAKLVNTLNTI PS LGQSPGP WVSNNS 3AH\ 
GSQRTSGPES SMKVTSS I PVFDLQDGGRKI CPRCNAQFRVT2AL 
RGHMCYCCPBMVEYQKKGKSLDSBPSVPSAAKPPSPEKTAPVAS 
/THPS3TPI PALSPPY/TKVPEPNENVGDAVQTKLlMIiVDDPYY 
GRDGGKVA^LTNFPKVATSFTlCPHCTKRIiKNNIRPMNHMKHHVE 

LDQQNGEVDGHTI cqhcyrqfstp fqlqchlenvhs p y es ttkc 
KI CB WAFES E PLFLQHMXDTH KPGEMP YVCQVCQYRSS LYS E VD 
VHFRMIHEDTRHI^CPYCLKVFKNGNAPQQHYMRHQKR\NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 

srgqprtvpvssndtppsalqeaapltssmdplpvflyppvqrs 
iqkravrkmsvmgrqtclecsfeipdfpnhfptyvhcslcryst 

CCSRAYANHMINNHVPRXS PKYLALFKNSVSGI KLACTS CT F VI' 
S VGDAMAKHLVFNP SHRS S S ILPRGLTWI AHS RHGQTRDRVHDR 
NVKKMYP P PS FP TN KAATV KS AG AT P AE PEELLTPLAPALPSPA 
STATPPPTPTHPOALAL PPLATEGAECLNVDDQDEGSPVTQEPB 
LASGGGGSGGVGKKEQLSV1CKLRWLFALCCNTEQAAEHFRNPQ 
RRIPJRWI^FXJASQGENI^GKYLSFBAEEKIiAEWVLTQREQQLP 
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL 
DTEVLS SDDRKBNALQTVGTGE PWCD WLAI LADGTVLPTLVFY 
RGQMDQPANMPDS ILLEAKESGYSDDB IMELWSTRVWQKHTACQ 
RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAVVPAGCSSKIOPL 
DVCI KRTVKNFIJIKKWKEQAREMADTACDSDVIXQLVLVWLGEV 
LGVIGD C PEL VQRSFLVAS VLPGPDGN INS PTRNADMQEEL I AS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLPEGESETES 
FYGFEEADLDLMEI 


6135 


2 


425$ 


FVHGSMADTDLFMECEEEELE PWQKISDV1 EDS WEDYNSVDKT 
TTVS VSQQ P VSAPVP IAAHAS VAGHLSTSTTVS S SGAQNSDSTK 
KTLVTLIANNNAGNPLVQQGGQPLI LTQNPAPGLGTMVTQPVLR 
PVQVMQNANHVTSS PVASQP IPITTQGPPVRNVRPVQNAMNQVG 
I VLNVQQGQTVRP I TLVPAPGTQF VKPTVG VPQVFSQMT PVRPG 
S TMP VRPTTNTFTT VI PATLTIRSTVPQSQSQQTKSTPS TSTTP 
TATQPTSLGQIAVQSPGQSNQTTNPKLAPSFPS PPAVS IAS PVT 
VKRPGVTGENSNEVAKLVNTIiNTIPSLGQSPGPVWSNNSSAH\ 
GSQRTSGPESSMKVTSSIPVFDLQDGGPJCICPRCNAQFRVTEAL 
RGHMCYCCPEMVE YQKKGKSLDS EPS VPSAAK? PSPEKTAPVAS 
/ THPSS TP I PALS PP Y/TKVPEPNENVGDAVQTKLIMLVDDPYY 
GRDGGKVAQLTNPPKVATSFRCPKCTKRLKNN IRFMNHMXHHVE 
LDQQNGEVDGHTICQHCYRQFSTPFQLQCHLENVHSPYESTTKC 
KICEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD 
VHFRMIHEDTRHLLCPYCLKVFKWGNAFQQHYMRHQKR\NVYH\ 
CNKCRVQFLFAKDKIEHKXiQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSALQEAAPLTSSNDPLPVPLYPPVQRS 
IQKRAVRKMSVMGRQTCLECSFEIPDFPNHFPTYVHCSLCRYST 
CCSRAYANHMINNHVPRKSPKYLALFKNSVSGI KLACTSCTPVT 
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ID 
NO:


Predicted

beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








svgdamakhlvfnpshrsssiLprgltwiahsrhgqtrdrvhjjk 

NVKNMYPPPSFPTNKAATVKSAGATPAEPEELLTPLAPALPSPA 
STATPPPTPTHPQALAliPPIATEGAECLNVDDQDEG5 PVTQEPE 
lASGGGGSGGVGKKECK^VKKLRVVLPALCCNTEQAAEHFRNPQ 
RR IRRWLRRFQASQGENLEGKYLSFEAEEKLABWVI*TQREQQLP 
VNEETLFQKATKIGRSLEGGFKISYEWAVRPMtiRHHLTPHARRA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFI* 
DTEVLS S D DRKENALQTVGTG B PNCDWLAI LADGTVLP TLVFY 
RGQMDQPANMPDSIIiLEAKESGYSDDEIMELWSTRVWQKHTAOQ 
RS KGMLVMDCHRTHLS E EVLAMLS ASSTLPAWPAGCSS KIQPL 
DVCIKRTVKNFLHKKWKEQARBMADTACDSDVLLQLVLVWLGEV 
LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRMADMQEELIAS 
LEEQLKLSGEHSESSTPRPRSSPEBTIEPESLHQLFEGESETES 
PYGFEEADLDLMEI 


6136 


1704 


539


PGVRMALEGMS KRKRKRSVQEGENPDDGVRGS PPEDYRLGQVAS " 

SLFRGEHHSRGGTGRLASLFSSIiEPQrQPVYVPVPK\BSALASA 

DLEEEIHOKQGQKRKNSQPGVKVADRKILD0TBDTWSORXKIQ 

INQEEERLKNERTVFVGNLPVTCNKKKLKSFFKEYGQIESVRFR 

SLIPAEGTLSXKLAAIKRKIHPDQKNINAYWFKEESAATQALK 

RN G AQ I ADG FR I RVD LASE TSS RDKRS VFVGNX. P YKVE ES AJ EK 

HFLDCGSIMAVRIVRDKMTGIGKGFGYVLFBNTDSVHLALKLNN 

SELMGRKLRVMRSVNKEKFKQQNSNPRLKNVSKPKQGLMFTSKT 

AEGHPKSLFIGEKAVLLKTKKKGQKKSGRPKKQRKQK 


6137 


141 


2656 


ralrkrrcgpgrrgalgsgpgpqrrpgrvpeerpapprerRhpg™" 

MWNMLI VAWCLA\LLGI,PGKAQELQGHVS\I ILAGEQLGDLAKK 

ylwqg\lfqi,yldeagrghsfsfhgaaltapkqgqslmakales 
lscpkijmapshcaehkdqflqlsqyrqlktaedyoalnkdieaq 
lqhaglreagg i fyfsvppfayediarninsscrpgpgawijrw 

LEKPFGHDHFSAQQLATELGTFFQEEEMYRVDHYLGKQAVAQXL 
PFRDQNRKALDGLWNRHHVERVEII MKETVDAEGRTSFYBEYGV 
I RD VLQNHLT RVLTLVAME L P HMVS SAEAVLRHKLQVFQALRGL 
ORGS A WGQ YQS YSEQVRRELQKPD SFHS LTPTFAGVI»VH IDKTL 
RWEGVPFILKSGKAIJ>ERVGYAR1LFKNG^CCVQSEKHMAAAQS 
QCLPRQLVFH I GHGDLGS P AVLVSRNLFRP SLPS S WKEMEGP PG 
LRLFGSPLSDYYAYSPVRERDAHSVLLSHIFHGRKNFFITTENL 
LASWNFWTPLLESIAHKAPRLYPGGAENGRLLDFEFSSGRLFFS 
QQQPBQLVPGPGPGPMPSDFQVLRAKYRESSLVSAWSEELISKL 
AND I BAT AVRAVR R FGQF HLALS G GS S PVAIiFQQLATAHYGFPN 
AHTHLWLVDERCVPL3DP BSNFQGLQAHLLQHVRIPYYN1H\AM 
PVHLQQRLCAEEDQGAHI YARE I S ALG ANSS FDLVLLGMGADGH 
TASLFPQSPTGTjDGEQLWLTTSPSQPHRRMSLSLPLINRAKIO/ 
AVLVMGRMKREI TTLVSRVGHEPKKWP ISGVLPHSGQLVWYMDY 
DAFLG 


6138 


4587 


934


EFSKLTDRWQNAVQGVRQRKGDVDGLVRQWQDFTTSVENtiFRFli 
TDTSHLI»$AVKGQERFS LYOTRSLIHEL KKTKE X H FOR p p tt n n t 

TLEAGEKLLLTTDLKTKESVGRRISQLQDSWKDMBPQLAEMIKQ 

FQSTVETWDQCEKKI kelks rlqvlecaqsedplpblhedlhnek 

EL I KELEQSLAS WTQNLKELQTMKADLTRHVLVBDVMVLKEQI E 
HIJ1RQWEDLCLRVAIRKQEIBDRLNTWVVFNEKNKSLCAWLVQM 
ENKVLQTADI S I EEM I E KLQKD CME EUMLPSENKLQLKQMGDQ L 
I KASNKSRAAE I DDKLNKINDRWQHLFDVI GS RVKKLKET FAF1 
QQLDKNMSNLRTWLAR IESELSKPWYDVCDDQEIQKRLAEOQD 
LQRDIEQHSAGVESVFNICDVLLHDSDACANETEC33S 1QQTTRS 
LDRRWRNICAMSMBRRMK1BETWRLWQKFLDDYSRFBDWLKSAB 
RTAACPNSSEVLYTSAKEELKRFEAFQRQIHBRI/TQLEIjINKQY 
RRLAR BNRTDTASRL KQMVHEGNQRWDNLQRRVTAVLRRIjRHFT 
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ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








NQRBB FEGTRBS ILVWLTEMDLQLTNVEHFSESDADDKMRQLNG 
PQQEITLNTNKlDQliIVFGEQLIQKSEP\liDAVLIEDBLEELHR 
YCQEVFGRVSRFHRRLTSCTPGLEDEKEASENETDMEDPREIQT 

WDHTGRRGGPSSSH\EEDEEAQYY\SALSGKSISDGHSWHVPDS 
PS CPEHH YKQMEGDRNVP P V P PAS STP YKPP YGKLLLP PGTDGG 
KEGPRVLNGNPQQEDGGLAGITEQQSGAPDRWEMIQAQEL\HNK 
LKI KQNL QQLNSD I S AI TT WLKKTEAELEMLKMAKPPSDIQEI E 
LR VKRLQB I LKAFDT YKAL WS VNVSS KE FLQTE S PE S TELQS R 
LRQLS LLWEAAQGAVDS WRGGLRQSLMQCQD FHQLS QNLLLWIA 
SAKNRRQKAHVTDPKADPRALLECRRELMQLE KELVERQPQVDM 
LQ3 1 SNS L L I KGHGEDC I EAEE KVHVI \ E KKL KQLRBQVSQDI»M 

GEEETES RVPGSTRPQRS FLSRWRAALPLQLLLLLLLLLACLIi 
PSSEEDYSCTQANNF\ARSFYPMLRYTNGPPPT 


6139


52 


1131 


LGDWVWSRTCG\^ETPTSVLRRARARGPCPTDSKWAI*PRLREGE 
TBRRPWEASSWKTL/LAGWIGGAASVIVGHPLDTVKTRLQAGVG 

VHMTT QPTDV\7VT3I?T?CMTrnCT7W/^MCT?I5r SOT R^7VKfOxnr«"/^Trc»evT 
IVjiN lijoLiKVViKKfcoI'li'bf r H-tsMor f UAblAV iWaWfGVf SN 

tqrflsqhrcgepeas p prtlsdlllas mvag ws vglgg p vdl 
ikirlqmqtppvsgrqprfevqgsgscg\epayqgpvhcittiv 
rneglaglyrgasamllrdvpg yclyfi p yvflse w i tpe actg 
pspcavwlaggmagai swgtatpmdwksrlqadgvylnkykgv 
ldci sqs yqkegl kvf prg itvnavrgfpms aamflg yels lqa 
irgdhavtsp 


6140 


694 


136


RPELELWRLRS RS WR P LGV P RRCHRRNWKB P VRAQ PLS VT VWAP 
RCORP/OPPAPEPSSPNAAVPEATPTPPJUlAQAlVX.RT DT/IO&DV 

SVAPQAEAEARSTPGPAGSRLGPETFRQRFRQ FRYQDAAGPREA 
FRQLRBL/SPRQWLRPD I \RTKEQ\ I VEMLVQEQLLAILPEAAR 
ARRI RRRTDVR I TG 




2 


984 


ASRAPARRLVFHAQrAHGSAIGRVEGFSSIQEIjYAQIAGAFEIS 
PSE I LYCTLNTPKIDMERUjGGQLGLEDFI fahvkgiekb vnvy 
KSEDSLGLTITDNGVGYAFIKRIKDGGVIDSVKTICVGDHIESI 
NGENIVGWRHYDVAKKLKELKKEELFTMKLIEPKKAFE3ELRSK 
AGKSSGEK3GCGRATLRLRSKGPATVEEMPSETKAK\AIEKIDD 
VLE L YMG I RDIDLATTMFEAGKDKVN PDE FAYALDETLGD FAFP 
DEFVPDVWGVIGDAKRRGL 


6142 


116 


602 


EABGEQVCGAKCCGDAPHVENREEETARIGPGVMESKEERALNN 
LlVENVNQENDEKI)EKEQVANKGEPrJlLPLWSEYCVPRGNRRR 

ravrqpii^yrwdimhrlgepqarmreenmerigeevrqlmekl 
rekqlshslravstdpphhdhhdefcXlmp 


6143 


2802 


276 


FRMRIFLHCPWNQQMWKIWNLl^TSLESCKAHLSIQKIiLKER\Q " 
\QLPVFKKMSIVETLKRHRVVVVAGET\GSGKSTQVPHPLLED 
LLLNEWEASKCNIVCTQPRRISAVSIjANRVCDELGCENGPGGRN 
SLCGYQIRMESRACESTRLLYCrrGVLLRKLQEDGLLSNVS/HM 
FI VDEV\HER \S VQSDFLL 1 1 LKE I LQ KRSDLHL I LM S ATVDS E 
KFSTYFTHCPILR I SGRS YPVEVFHLEDI IEETGFVLEKDSEYC 
QKFLEEEBBVTINVTSKAGGIKKYQEYIPVQTGAHADLNPFYQK 
YSSRTQHAI LYMNPUKINLDLI LELLA YLD KS P QFRNI EGAVL I 
FLPGLAHIQQLYDLLSNDRRFYSERYKVIALHS ILSTQDQAAAF 
TLPP PG VR KI VIiATNIAETG I T I PDWF VI DTGRTKENKYHES S 
QMSSIiVETFVS KASALQRQGRAGR VR DGFCFRM YTRER FEGFMD 
YSVPEILRVPLBELCLHIMKCNLGSPBDFLSKAliDPPQLQVJSN 
AMNLLRKIGACBLNEPKLTPLGQHLAALPVNVKIGKML IFGAIF 
GCLDPVATLAAVMTEKS PFTTP IGRKDEADLAKSALAMADSDHL 
TI YNAYLGWKKAROEGG YRS E I T YCRRU FLNRTSLIiTLED VKQE 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LIKIiVKAAGFSSSTTSTSWEGNRASQTLSFQEIALLKAVLVAGL 
YDNVGKI I YTKS VDVTE KLAC I VETAQG KAQVHPS S VNRDLQTH 
GWLLYQBKIRYARVYLRBTTLITPFPVLLPGGDIEVQHRERLIiS 
IDGWI Y FQAP VKIAVI F KQLRVLIDS VLRKKLENP KMSLENDKI 
LQIITELIKTBNN


6144


1289 


568 


SGPGSMSGQRVDVKWMLGKEYVGKTSLVERYVHDRFLVGPYQN 
VS AS GGARHGGRGS GG PVICT YGPDI»FPL VA\ TI G AAFVAKVMS 

FERAKFWVKELRSLEEGCQIYLCGTKSDLLEEDRRRRRVDFHDV 
QDYADNI KAQL FETS S KTGQS VDELFQKVASDYVSVAAFQVMTE 


6145 


1109 


196 


GGI4DLSELERDNTGRCRLSSPVPAVCRKEPCVLGVDEAGRGPVL 
G PMVYAI CYCPLPRLADLEALKVADSKTLLESERBRLFAKMBDT 
DFVGWALDVLS PNI»I S TSMLGR VKYNLNSLS HDTATGL I Q YALD 
QGVNVTQVFVDT VGMPE T YQARLQQS F PGI E VTVKAKADAL YP V 
\VSAAS I CAKVARDQAVKKWQFVEKLQDLDTD YG\SG YPNDPQD 
/TKAWLKEHVEPVF\GFP\QFVRF\SWRTAQTI\LBKEAEDVIR 
EDSASENQEGLRKITSYFLNEGSQARPRSSHRYFLERGLESTTS 
L 


6146 


428 


781 


LKXKGKEKAEAQQVEAI/PGPSLDQWHRSAGEEEDGPVLTDEQKS 
R/ YPGHEAHDQGG\WDARQS I IRKWDPETGRTRLi KGDGBVLE 
BIVTKERHRB INKQATRGDCLAFQMRAGLLP 


6147 


1 


2304 


GTRQLPP PS PGSGPGDS PEGPEGEAPERRRKAHGMLKIiYYGLSE 
GEAAGRPAGPDPIiDPTDLNGAHFDPBVYLDKLRRBCPLAQLMDS 
ETDMVRQIRALDSDMQTLVYENYNKFIS ATDTI RKMKNDFRKME 
DEimRLATNMAVITDFSARISATLQDRHERITKIiAGVHALLRKL 
QFLFELPSRLTKCVELGAYGQAVR YQGRAQAV1*QQ YQHLP S FRA 
IQDDCQVITARLAQQLRQRFREGGSGAPEGAECVELLIALGEPA 

SGFVGGLCQVAAAYQELFAAQGPAGAEKLAAFARQLGSRYFALV 
ERRLAQEQGGGDNSI^VPJVLDRFHRRLRAPGALLAAAGLADAAT 
E I VERVARERLGHHLQGLRAAFLG CLTD VRQAIAAPRVAGKEG P 
GLAELLANVASS I LSH IKASLAAVHLFTAKEVS FSNKPYFRGSF 
CSQGVREGLI VGFVHSMCQTAQS FCDS PGEKGGATPPALLLLLS 
RLCLD YETAT IS Y I LTLTDEQFLVQDQFP VTP VS TLCAE ARETA 
RRLLTHYVKVQGLVISQMIAKSVETRDWLSTLEPRNVRAVMKRV 
VEDTTAIDVQVLPRIiAGVALTQAGGTVPSRGAGAAEDHWQSLPG 
GGDMCIWASHGASSVARASVREPQGNKSPRMNTKRAGECLCPRS 
CS FSAQD YDIFAP I L PVEKQRLRVTQE VRAGL VXVLKI RPQTNS 
CILPLPHSTGSINSDHVPTK 


6148


30*6 


353 


vpavggtfadgamgeaekfh yi yscdld invqlki gslegkreq 
ksykavledpmiikfsglyqetcsdlyvtcqvfaegkplalpvrt 
s ykafstrwnwnem lklpvkypdlprjsaqvalt i wd vygpgkav 
pvggttvslfgkygmfrqgmhdlkvwpn crsqm dqkptktpgrt 
ssti>sedq>^la:<ltkahrqghmvkvdwldrltfreiemines 
vkrssnfmylmggfrcvkcddkeygivyyekdgdesspiltsfe 
lvkvpdpqmslenlveskhhnlprslrsgpsdhdlkpypsprdq 
lknivsyppskpptyeeqdlvwefryyltnqdkaltkiltsviw 
dlpqgakqaiiallgkwkpmdvedsleliisshytnptvrryavar 
lrqaddedllmyllqlvqalkyenfddikngleptkicdsqssvs 
envsnsginsabidssqiit/sapfpsvssppp\asktkevpdg 

ENIiEQDLCTFLI SRAS KNSTLANYLYWYVI VECEDQDTQQRDP K 
THEMYLNWRRFSQALLKGDKSWVMRSLWW^QQTFVDRLVHLM 
KAVQRESGNRKKKNERLQALLGDNEKMNLS DVELIPLPLEPQVK 
1RG1IPETATLFKSALMPAQLFFKTEDGGKYPVIFKHGDDLRQD 
QLILQIISL^KLLRKENLCLKLTPYKVLATSTKHGFMQFIQSV 
SEQ
ID 
NO: 


Predicted
beginning
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








P VAEVLDTEGS I QNFFRKYAPS ENG PNG IS AEVM DTYVKSCAGY 
CVTTYI LG VGDRHLDNL LliTKTG KLFH I DFGYI LGRDPKPLP P P 
M KLNKEMVEGMGGTQS EQ YQEFRKQC YTA KLHLRRYSNL ILNLF 
SLMVDANIPDIALEPDKTVKKVQDKPRLDLSDEEAVHYMQSLID 
BSVHALFAAWEQI HKFAQ YWRK 


6149 


1


1413 


R VDPRVRENGTANP I KNGKTS PAS KD QRTG KKTS VQGQ VQKGND 
BSES DPE5DP PSPKS SE EEEQDDEEVLQGEQGDFNDDDTEPEK L 
GHRPLLMDSEDEEEEEKHSSDSDYEQAKAKYSDMSSVYRDRSGS 
GPTQDLNTI liLTSAQLSSD VAVETPKQE FD VFGAVPFFAVRAQQ 
PQQEKNEKNLPQHRFPAAGLEQEEFDVFTKAPFSKKVNVQECHA 
VGPEAHTIPGYPKSVDVFGSTPFQPFLTSTSKSESNEDLFGLVP 
FDE I TGSQQQKVKQRSLQKLS SRQRRTKQDMSKSNG KRHHGTP' r 
STKKTLKPTYRTPERARRHKKVGRRDSQSSNEFLTISDSKENIS 
VALTDGKDRGNVEiQPEESLLDPFGAKPFHSPD\l^WHPP\HQGIi 
S\DIRADHNT\VLPGR\ PRQNSLHGSFHSADVLKMDDFGAVP /F 
LTELWQS ITPHQSOQSQPV\ELDPFGAAPFPSKQ 


6150 


372 


37 


MSN I KKY 1 1 D YD W KAS I B I E IJDHD VMTEEKii&Q INNF WSD SE YR 
LN KHGSVLNAVL IMLAQHALLI AI SSDLNAYGVVCEFDWNDGNG 
QE GWPPMDGS EGIRITD I DTSG I F 


6151 


1555 


521 


DSNQQS VSGTAASTLLHS FKATI YYQGTGHVQQF YG VTS PYSQT 
TP P I VQS YAQPS LQYI QGQQI FTAHPQG WVQPAAAVTT I V APG 
QPQPLQPSEMVVTNNLLDLPPPSPPKPKTIVLPPNWKTARDPEG 
KI YYYHVITRQTQWDP PT WES PGDDASLEHE AEMDLGTPTYDEN 
PMK\ASKKPKTAEADTSSELAKKSKEVFRKEMSQFIVQCLNPYR 
KPDCKVG \R I TTTEDFKHLARKLTHGVMNKELKYCKNPB\ DLEC 
NENVKHKTKBYIKKYKQKFGAVYKPKBDTEFRVTVGPGWEDGWS 
GKTDSRERKSCGPFCSTPVSTVLLMIHHPGEFNPADVN 


6152 


1366 


648 


NRTWSTPSTWMGVALPPLCSTGPWPVTRQITARTTCGAVPAKCP 
PWC/DVHEPRCQPPDCHGHGTCVDGHCQCTGHFWRGPGCDELDC 
GPSNCSQHGLCTETGCRCDAGWTGSNCSEECPLGWHGPGCQRPC 
KCEHHCPCDPKTGNCSVSRVKQCLQP PEATLRAGELSFFTRTAW 
LALTLAIAFLLLI STAANLSLLLSRAERNRRIiHGDYAYHPLQEM 
NGEPLAAEKEQPGGAHNPFKD 


6153 


2 


3368 


GRVGARSPGRAYALLLLLICFNVGSGLHLQVLSTRNENKLLPKH 
PHLVRQKRAWITAPVALLEGEDLSKKNPIAKIHSDIiAEERGLKI 
TYKYTGKG ITE P P FG I FVFNKDTGELNVTS ILDREETPFFLLTG 
YALDARGNNVEKPIi ELRI KVLDI NDNEPVFTQDVFVGSVEELS A 
AHTLVMKINATDADEPNTLWS KIS YRI VSLEPAYPPVFYLNKDT 
GE IYTTS VTI£)REEHSSYTLTV3ARDGNGEVTDKP VKQAQVQ IR 
I LD VNDN I P WENKYL EGMVE ENQVNVE VTRI KVFDADE I GSDN 
WIiANFTFASGNEGGYFHIETDAQTNEGIVTLIKEVDYEEMKNLD 
FSVIVANKAAFHKSIRSKYKPTPIPIKVXVKNVKEGIHFKSSVI 
S I YVSESMDRSSKGQ I IGNFQAFDEDTGLPAHAR YVKLEDRDNW 
I SVDS VTSEIKIiAKLPDFESRYVQNGTYTVKIVAI SEDYPRKTI 
TGTVLINVEPINDNC PTLIEP VQTICHDAEYVNVTAEDIiDGH PN 
Jurcjf AJJ ^"«*^H&iWKiJuiQ^TSVLLQQSEKKIjGRSElQ 
FLISDNQG FS CPEKQVLTLTVCE VLHGS \ GCREAQHDS YVGLGP 
AAI ALM1 LAFLLLLLVPLLHiMCHCGKGAKGFTP I PGTI EMLHP 
WNNEGAP PEDKWPS FLP VDQGG SLVGRNG VGGMAKEATMKGS S 
SASIVKGQHEMSEMIX3RWEEHRSLLSGRATQFTGATGAI\MTTE 
TTI TARATGAS RDVAGAQAAAVALN EEFLKNY FTDKAAS YTEED 
ENHTAKDCLLVYSQEETESLNAS IGCCSFIEGELDDRFLDDLGL 
KFKTLABVCLGQKIDINKEIEQRQKPATETSMNTASHSLCEQTK 
VNSENTYSSGSSFPVPKSLQEANAEKVTQEIVTERSVSSRQAQK 
VATPLPDPMASRNVXATETSYVTGSTMPPTTVILGPSQPQSLIV 
reRVYAPASTLVIX?PYANEGTVVVTERVIQPHGGGSNPLEGTQH 
SEQ
ID
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








t^DVPYVMVRER2SFLAPSS3VQPTIiAMPNIAVGQNVTVTSRVI» 
APASTLQS S YQ I PTENSMTARNTTVS GAGVPG PLPDFGLE SSGH 
SNSTI TTSSTRVTKHSTVQHSYS 


6154 


3660 


2146 


FCKKTKMKNTLQKTVNFGAWPKPTI SDKSH LLQMVS KLDLTDAKN 
SDTAHI KS I E I TS I LNG LQASES 3AEDSEQBDERGAQDMDNNGK 
EESKI DHLTNNRNDL ISKEEQNS SSLLEENKVHADLVIS KPVSK 
SPERLRKD1 EVLSEDTDYEEDEVTKKRKDVKKDTTDKSS KPQIK 
RGKRRYCNTEECLKTGSPGKKEEKAKNKESLCMENSSNSSSDED 
EEETKAKMTPTKKYNGLEEKRKSLRTTGFYSGFSKVAEKRIKLL 
NNSDER LQNSRAKDRKDVWS S IQGQN P KKTLKELFSDS DTE AAA 
SPPHPAPEEGVAEESLQTVAEEESCSPSVELEKPPPVNVDSKPI 
EEKTV E VNDRKAE PPSS G SNFSA* I PLPYLHLNRLHQSL *QKGS 
RQQS S VTVS EPLAPNQEE VRS IKSETDSTIBVD S VAG EIjQDIjQS 
ERE* LASRF * CQCELEQ * * S ARTRTS * KSL YRS EKSERC SGRRK 
P I KKAE KKP * SNS GKQQKEG K 


6155 


669 


121 


HLLP ELR3KS W ITMK YVFYLGVLAGTFFFADS S VQKE DPAPYXV 
YLKSH FNPCVGVLI KPSWVLAPAHCYLPNLKVMLGNPKSRVRDG 
TEQTINPIQIVRYV^SHSAPQDDLMLIKLAKPAMLNPKVQALN 
P \ PTTNVRPGTVCLLSGLDW SQEKSGRHPDLRQNLEAP VMS DRE 
CQKTEQGK5HRNSLCVKFVKVFSR I FGEVAVATVI CKDKLQGIE 
VGHFMGGDVG I YTNVYKYVSWI ENTAKDK 


6156


5725 


3934 


GTST VTMATKKHFS III^LIjGMLLKXDNQDTRKLIiMTWALEVAV 
VMKKSET YAPLFCLPS FHKF CKGLLADTLVEDVll I CLQACSS LH 
ALS S SLPDDLLQRCVDVCR VQL VHRGTCIRQAFGIOliLKS I PLGV 
FLSNNNHTE IQ E I SIALRS HMSKAPSNTFHPQDFSD / V IS FI LY 
GNSHRTGKDKWLERLFYSCQRLDKRDQSTIPRNLLKTDAVLWQW 
AIWEAAQFTVLS KLRTPLGRAQDTFQTI EGI IRSLAGHTLNPDQ ! 
D VS QWTTADTOEGHGtWQLRLVLLLQ YLENLEKLM YNAYEG CAN i 
ALTS P P KVIRT FL YTNRQTCQDWLTR I RLS IMRVG LLAGQPAVT i 
VRHGFDLLTEM KTTS LSQGNELEVS IMM WBALCELHCPEAI QG 
IAVWSSSIVGKHLLWINSVAQOAEGRFEKASVEYQEHLCAMTGV 
DCC IS S FDKS VLTLAS AGCKS ASLKHCLNGESRKS VLSKPTDSS 
PEVINYW3NKACECYISTADWAAVQEWQNAIHDLKKSTSSTSLN 
LKAD FNY IKS LSS FESGKFYECTEQLELLPGEN3 NLLAGGS KEK 
IDMKKLLRNM 


6157 




329 


MANRGPS YGLSREVQBKIEQKYDADLENKLVDWI I LQCAB0IEH 
PPPGRAHFQKWLMDGTVLCKLINSLYPPGQEPIPKISESKMAFK 
QMEQI SQFLKAAETYGVRTTDIFXJTVDLWEGKDMAAVQRTLMAI, 
GSVAVTKDDGCYRGEPSWFHRKAQQNRRG7SEEQLRQGQNVIGL 
QMGSNKGASQAGMTGYGMPRQIM* DAASCP 


6158


441 


1482 


LGSLIVLSLHCKVI PSSQSLERAMKEKAVDLVPILAQNPGLAQN 
P ILEGKDHNQNTG VD P 1 1 DHVQDRKTD/S RSKS PHKKRS KSRER 
RKSRSRSHSRDKRKDTREKI KEKERVKE KDREKEREREKEREKE 
KERGKNKDRDKEREKDREKDKEKDREREREKEHEKDRDKEKEKE 

UUAOiNCiK&iwKDAAl ISXUUCAJUJJVAoAC 1 r fKblJNiioKRSRSSSRE 

RRRRRSRSSSRS PRTSKT1KRKSSRSPSPRSRNKKDKKREKERD 
HISERR£RERSTSMRKSSNDRDGKEKLEXNSTSLKEKEHNKEPD 
SSVSKEVDDKDAPRTEENKIQHNGNCQLNEENLSTKTEAV 


6159 


53


84 


AVIAPLHI S LGDRARPYLKNTEKSSTTCSRRRNQS FPP VMSLTH 
RLHLCKYWGCAVSWCRFWEGRPLPLMIVVPYTIiPVSLPVGSCV 
1 1 TGTP I LTFVKDPQLE VNFYTGMDBDS D I AFQFRLHFGHPAIM 
NSCVFGIWRYEEKCYYGPFEDGKPFELCIYVRHKEYKVMVNGQR 
tYNFAHRFPPAS VKMLQVFRD I SLTRVL ISD*GRC VRITAVQEF 
DVS VS CD CTTAYQPG 


6160 


1626 


1790 


AGAKFFP * F* KVADAQPTESEKEI YNQVNWL»KDAEGILEDLQS 
YRGAGHBIREAIQHPADEKLQEKAWGAWPLVGKLKKFYEFSQR 
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SEQ 
ID
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first
amino acid
residue of
amino acid
sequence


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LKAAT.RGLLGALTSTPYSPTQHLEREQALAKQFABIJjHFTLRFD 
ELKMTNPAIQNDFSYYRRTLSRMRINNVPAEGENBVNNELANRM 
SLFYAEATPMLKTLS DATTKFVSENKNLP I ENTTDCLSTMASVC 
RVMLETPEYRSRFTWEKTVSFCLRVMVGVIILYDHVHPVGAFAK 
TS KI DMKGCI KVLKDQ P PNS VEGLLNALRYTTKHLNDETTSKQI 
KSMLQ*QLLTLVNKG 


6161 




1569


P VSGSES SLRRAWAS 1 LRLMLG PRVAVS I LCEDGISH* LLEKH* 
KS H VLEP LS SLALE EQCLALS LDWS TGKTGRAGDQPLK 1 1 S SDS 
TGQLHLLMVNETRPRLQKVASWQAHQFEAWIAAFNYWHPEIVYS 

ggddgllrgwdtrvpgkflftskrhtmgvcsiqssphrehilat 
gs ydehillwdtrnm kq pladt pvqggvwr i kwhp fhhhlliaa 
cmhsgfkilncqkameerqeatvlt3htlpdslvygadwswllf 
rslqrapswsfpsnlgtxtadlkgaselptpchecredndgegh 
arpqsgmkpltegmrkngtwlqataattrdcgvnpeeadsafs l 

LATCS F YDHALHLWE WEGN 


6162 


1 


586 


RTIHATGRAGAS PMHRL I VWRIAEANKQHVRCQKCLEFGHWTYB 
CTGKRKYLHRPSRTAELKKALKEKBNRLLLQQS IGETNVERKAK 
KKRSKSVTSSSSSSSDSSAS DSSSBS EETSTSSSSEDSDTDESS 
SSSSSSASSTTSSSSSDSDSDSSS$SKQ*HQHR*QL*R+TTKEB 
EKEIELLHSYWTDGLKTLM 


6163


1081 


785 


RIRSTTEGCAVRLHPTQNTGKARIMI LLSVS LGRHWAFTYKFFL 
TPWFVFFPPFFHRKE*VMQKNPMKSREDEWMEKLNNLHVQRAD ! 
MNRIi I MNYLVTEGFKEAAE KFRMESG I BPSVDLETLDER I KI RE 
M I LKG Q I QEA I AL I NSLHP E LLDTNR YLYFHLQQQHL IEL I RQR 
ETEAAIiEFAQTQLAECXSEBSRECLTEMERTLALIAFDSPEESPF 
GDLLHTMQRQKVWSEVNQAVIX>YENRESTPKLAKLLKLIJ»MAQN 
ELDQKKVKYPKMTDLSKGVI EEPK 


6164 


90 


406 


PCQS PGRS RMRQDKLTG SLRRGGRCLKRQGGG VGT I LSNVLKXR 
SCISRTAPRLLCTLEPGVDTKLKFTLEPSLGQNGFQQWYDALKA 
VARLSTGI PKESfRRKVWLTLADHYLHS I AIDWDKTMRFTFNERS 
NPDDDSMGIQ I VKDLHRTGCS S YCGQE AEQDRWLKRVLLAYA& 
WNKTVGYCQGFNILAALILEVMEGNEGDALKIMIYLIDKVLPES 
YFVNNLRALSVDWAVFRDLLRMKLPELSQHLDTLQRTANKES GG 
G YEP PLTNVFTMQWFLTLFAT CLPNQT VLKI WDS VP FEGSE 1 11* 
RVSLAI WAKLGEQ I ECCETADE FYS TMGRLTQEMLENDLLQSHE 
I^QTVYSMAPFPFPQLAELREKYTYNITPFPATVKPTSVSGRHS 
KARDS DEENDPDDEDAWNAVG CLG P FSGFLAPELQKYQKQ IKE 
PNEEQ SLRSNNIAE LS PGAI NS CRSB YHAAFNSMMMERMTTD I N 
ALKRQYSRIKKKQQQQVHQVY IRADKGPVTSILPSQVNSSPVIN 
HLLICKKMK^f^NRAAKNAVIH IPGHTGGKISPVPYEDLKTKLJTS 
PWRTH IRVHKKNMPRTKSHPG CGDTVGLI DEQNEASKTWGLG AA 
E A FPS GCTATAGREGSS PEGS TRRTI EGQS PEPVFGDADVDVS A 
VQAKLGALELNQRDAAAETBLRVHPP CQRHCPEP P SAPE ENKAT 
S KAPQGSUS KTP I FS P FPS VKPLRKSATARNLGLYGP TERTPT V 
nr rv"oi«.or ot\h%jt3t»w boi 1 " Jwivr o IFluoKy.Lii'tiX PQEYQRN 
GGERFG 


6165 


SO 


405 


PCQS PGRSRMRQDKLTGSLRRGGRCLKRQGGGVGTI LSNVLKKR 
SCISRTAPRLLCTLEPGVDTKLKFTLEPSLGQNGFQQWYDALKA 
VARLSTGI PKEWRRXVWLTLADHYLHS I AIDWDKTMRFTFNERS 
NPDDDSMGIQIVKDLHRTGCSSYCGQEAEQDRVVLKRVLLAYAR 
WNKrVGYCQGFWILAALI LEVMEGWEGDALKIMIYLIDKVLPES 
Y FVNNLRALS VDMAVFRDLLRMKL PELSQHLDTLQRTANKE SGG 
GYEPPLTNVFTMQWFLTLPATCLPNQTVLKIWDSVFFEGSEIIL 
RVSLAIWAKLGEQI ECCETADEFYSTMGRLTQEMLENDLLQSHE 
LMQTVYSM APFP FPQLAE LREKYT YNITPFPATVKPTS VS GRHS 
KARDSDEBNDPDDEDAWNAVGCLGPFSGFLAPELQKYQKQIKE 
PNE EQS LRSNN I AELS PGAINS CRS E YHAAFNSMMMBRMTTD I N 
ALKRQYSR I KKKQQQQVHQVYIRADKGPVTS I L PSQ VNSS P VI N 
HLLLGKKMKMTNRAAKNAVIHI PGHTGGKI S P VPYEDI>KTKtiNS 
PWRTH I RVHKKNM PRTKS H PGCGDTVGLIDEQNEAS KTNG LGAA 
EAF PSG CTATAGREGSS P EGSTRRTI EGQS PEPVFGDADVDVS A 
VQAKLGALELNQRDAAAETELRVHPPCQRHCPEP PS APEENTCAT 
SKAPQGSNSKT PI PSPPPSVKPLRKSATARNLGLYGPTERTPTV 
HF PQMSRSFSKPGGGNSGP * KMVFS SGTMTiS ft (YLT>czvvnv vat? nj 
GGERFG 


6166


2 


1206 


HKLWRTVAMAGAEWKSLEECLEKHLPLPDLQEVKRVLYGKELRK 
LDLPREAFEAASREDFELQGYAFEAAEEQLRRPRIVHVGLVQNR 
I PLPANAPVAEQVSALHRRI KAIVEVAAMCGVNI ICFQEAWTMP 
FAFCTRE KLP WTEFAESAEDG PTTRFCQKLAKNHDM WVS PILE 
RDSEHGDVLWNTAVV ISNSGAVLGKTRKNHI P R VGDFNE S T YYM 
EGNLGHP VFQTQ FGR IAVNI CYGRHKPLNWLM YS I NGAE 1 1 FNP 

SATIGALSESLWPIEARNAAIANHCFTCAINRVGTEHFPNEFTS 
GDGKKAHODFGYFYGSSYVAAPn^QRTDrT «IP«:DTVT T UAVT tyt 

NLCQQVND VWNFKMTGRYEMYARELAEAVKSNYS PTI VKE * PAS 
VPALG


6167


1220 


1844 


YGIVTGPSLCAGDKQPKKQEKNPVLVSPEFVDEAIiCACEEYLSN
LAKMD IDKDLEAPLYLTPEGWSLFLQRYYQWHEGAELRHLDTO 
VQRCEDILQQLQAWPQIDMEGDRWIWIVKPGAKSRGRGIMCMD 
HLEEMLKLVNGNPWMKDGKWWQKY IERPLLI FGTKFDLRQWF 
LVTDWNPLTvWF YRDS YI R FSTQP F£ LKNLDK* AP LYLTPEGW S 
LFLQRYYQVVHEGAELRHI.DTQVQRCEDILQQLOAWPQIDMEG 
DRNI W I VKPGAKS RGRG IMCMDHLEEMLKLVNGNPWMKDGKWV 
VQKY I ERP LLI PGTKFDLRO WFTjVTDWNP tvfvwfyp r>Q v t p v « t 
QPFSLKNLDK 


6168 


84 


1332 


VWPVPSVSAMPPKKQAQAGGSKKAEQKKKEKIIEDKTFGLKNKK 
GAKQQKFI KAVTI IQ VKFGQQNPRQVAQSEAEKKLKKDDKKKE LQ 
ELNEIjFKP WAAQKISKGAD PKS VVCAFFKQGQCTKGDKCKFSH 
DLTLEPJKCEiCR5VYIDAIU)EELEKDTMDNWDBKKLEEVVNKKHG 
EAEKKKPKTQIVCKHFLEAIENNKYGWPWVCPGGGDICMYRHAL 
PPGFVLKKKKKKKKKEDE ISL*DLIERERSALGPNVTKITLES F 
LAHKKRKRQEKIDKIiEQDMERRKADFKAGKALVISGREVFEFRP 
ELVNDDDEEADDTRYTQGTGGDE VDDS VSVND IDLS LYI PRDVD 
ETGITVASLERFSTYTSDKDEHKLSRASGGRAEHGERSDLBEDN 
EREGTENGAIDAVPVDENLFTGEDLDELEEBLNTLDLEE 


6169 


112 


662 


APAAAMAERPEDLNLPtjAVITRIIKEALPDGVNISKEARSAISR 
AAS VFVL YATS CANNFAMKG KRKTLNAS DVLSAM E E ME FQR F VT 
PLKEALEAYRRBQKGKKEASBQKKKDKDKKTDSEEQDKSRDBDN 

DEDEBRIiEEEEQNEEEEVDN*KGRETVAPWKVPLEMRRATCFCE 
AFPCWAE 


6170 


62 


667 


STKVMLPNTGRLAGCTVFITGASRGIGKAIAliKAAXDGANIVIA 
AKTAQPHPKLLGTIYTAAEEIEAVGGKALPCIVDVRDEQQISAA 
VEKAIKKFGG IDI LVNNASAI SLTNTLDTPTKRLDLMMNVNTRG 
TYLASKACI PYLKKSKVAHI PNISPPLHLNPVWFKQHCGRW-* W 
G* GDGLCLI CFELNLCMSDVI TI CT 


6171 


382 


941 


HFMQSDVEIiDCDIEPCXJHTKFPPTLPLSTTVIVCSCHPVATAST 
I4AEAFSKTTSEEDQSIQEPKEANSMTAQKQKK*GLRGSRRRHAN 
SGGDIFGDSFAAYPPRVLKQVHQALSLSQEAVSVMDSMVRDILD 
R IATEAGHLAHYSXCVTITSIiDIRMA VCLLLPGKMGKLAESQGT 
NATLRYTKSK 


6172


651 


54 


GLCRAGGAHRFSRTHVEAALKMIjRREARLRREYLYRKAREEAQR 
SAQERKBRlJfcRALEENRLI PTELRRE ALAliQGSLEFDDAGGBG V 
TSHVDDEYRWAGVEDPKVMITTSRDPSSRLKMFAKELKLVFPGA
QRMNRGRHEVGAJUVRACKANGVTDIitiWHEHRGTPVGLIVSHLP
FGPTAYFTLCNWMRHDIPDLGTMSEAKPHLITHGFSSRLGKRV 
SDILRYIiFPVPKDDSHRVITFANQDDYISFRHHVYKKTDHRNVE 
LTE VGPRFELKLYM I RLGT LE Q EATADVE WR WH P YTNTARKR VF 
LSTE*AAPRPLGQI*L 




3 


288 


SVDHREVQVLSQSMPLTPHQAVLRGERPYMCVECGKCFGRSSHL 
LQHQRIHTGEKPYVCSVCGKAFSQSSVLSKHRTIHTGEKPYECN 
ECGKAFRVSSDIAQHHKIHTGEKPHECLECRKAFTQLSHLIQHQ 
Rl HTGERPY VCPLCG KAFNHSTVLRSHQRVHTGEKPHRCNECGK 
TFSVKRTLLQHQRIHTGEKPYTCSECGKAFSDRSVL1QHHNVHT 
GBKPY^CSECGKTFSHRSTLMNHBRIHTEEKPYACYECGKAFVQ 
HSHLIQHQKVHRKL* PTCVLSVGSALAGVPTS FS IS VSTLERS P 
MCAVYVGRPSARAQSLVNTGQFTQVRSPMSVMSVEKPLE 


6174 


1060 


959 


PRPPGKRWMVAGLGNPGLPGTRH5VGMAVLGQLARRLGVAESWT 
RDRHCAADLALAPIXjDAQLVIjLR prrlmnangrsvaraaelfgl 

TAEEVYLVHDELDKPLGRLALKLGGSARGHNGVRSCISCLNSNA
MPRLRVGIGRPAHPEAVQAHVLGCFSPAEQELLPLLLDRATDLI
LDHIRERSQGPSLGP*H*WFSKKA


6175


2204 


334 


RYFRADPRSRSGQPRAEGLGAFAEGPLRAMAAPVKGNRKQSTEG 
DALDPPASPKPAGKQNGIQNPISLEDSPEAGGEREEEQEREEEQ 
AFLVS LYKFM KERHTP IERVPHLGFKQINLWKI YKAVEKLGAY K 
LVTGRRIiVIKNWWT'rif2f?QD(TiCTQr2JVTr ,, PDDLrv*t>T \ it nvrmur v 

GEDDKPLPTSKPRKQYKMAKENRGDDGATERPKKAKEERRMDQM 
MPGKTKADAADPAPLPSQEPPRNSTEQQGLASGSSVSFVGASGC 
PEAYKRIXSSFYCKGTHGIMSP3LAI0CKLLAQVSKVEALQCQEEG 
CRHGAEPQAS PAVHLPE S PQSPKGLT3NSRHRLTPQEGLQAPGG 
SLREEAQAG PCPAAP I FKGCPYTHPTEVLKPVSQHPRDFFSRLiK 
DGVLLGPPGKEGLSVKEPQLVWGGDANRPSAFHKGGSRKGILYP 
KPKACWVSPMAKVPAESPTLPPTFPSSPGLGSKRSLEEEGAAH5? 
GKRLRAVSPFLBCEADAKKCGAKPAGSGLVSCLLGPALGPVPPEA 
YRGTMLHCPLNFTGTPGPLKGQAALPFSPLVIPAFPAHFLATAG 

PSPMAAGLI^FPPTSF1)SAIJIHRLCPASSAWHAPPVTTYAAPHF 
FHLNTKL 


6176 


1040 


402 


P^ALRAMAEVHVIGQIIGASGFSESSLFCKWGIHTGAAWKLtiS 
GVREGQTQVDTPQIGDMAYWSHPIDLHFATKGLQGWPRLHFQVW 
SQDSFGRCQLAGYGFCHVPSSPGTHQLACPTWRPLGSWREQLAR 
AFVGGGPQLLHGDTIYSGADRYRLHTAAGGTVHLEIGUiLRNFD 
RYGVEC * GTLP P TS PPSTPRTPSDGGG WHSGQEHRL 


6177 


1400 


992 


VPIESLVGKVHNFPLIAPYCCEKGKRQPHKSLHDRCFGEALDPN 
CSHCYLDQIKRSDFLGFSGYSPHFVAISTKSEHKMQPSSMQQAL 
PSQ*PYWTDPRPALVPCCSHRPDVERSRPGPGLPGTSGCSDRPP 
VCPI 


6178 


1027 


254 


STQRGGIKGVARAASLVGRRRAGTGMALLLCLVCLTAALAHGCL
HCHSNFSKKFSF YRHHVNFKSWWVGDI PVSGALLTDWSDDTMKE 
LHLAIPAKITRBKLDQVATAVYQMMDQLYQGKMYFPGYFPNELR 
NI FREQVHI>1QNAI I ESRIDCQHRCGIFQY3TISCNNCTDSHVA 
CFGYNCESSAQWKSAVQGLLNY INNWHKQDTSMRPRSSAFSWPG 
THRAAPAFLVLPALRCLEPPHLANLSLEDAA* CLKQH 


6179 


806 


276 


RGETREMAGNLLSGAGRRLWDWVPLACRSFSLGVPRLIGIRLTL 
PPPKWDRWNEKRAMFGVYDNIGI LGNFEKHPKELIRGPIWLRG 
WKGNELQRCIRKRKMVGSRMFADDLHNLNKRIRYLYKHFNRHGK 
FR* KRKLRTSEKAHLS PWRRETVLFPVRKRLCI FS VI KWGFFGI 


6180 


156


1833 


DHH ILKAAS TTHVCARGNI FAI PNTRCLB C* ATATPS S LECQN * 
SHI^LCPLPATTSGLTPNSMIPEKERQNIAERLLRVMCADLGAL 
S WSGKE FLKLAQTLVDSGAR YGAFS VTB ILGNFNTLALKHLPR 
MYNQVKVKVTCALGSNACLGIGVTCHSQSVGPDSCYILTAYQAE 
GNHIKSYVLGVKGADIRDSGDLVHHWVQNVLSEFVMSEIRTVyV

TDCRVSTSAFSKAGMCLRCSACALNSWQSVLSKRTLQARSMHE 

VIEIiLNVCEDLAGSTGLAKBTPGSLEETSPPPCWNSVTDSLLLV 

HERYEQICEFYSRAKKMNLIQSLNKHLLSNLAAILTPVKQAVIE 

LSNESQPTLQLVLPTYVRLEKLFTAKANDAGTVSKLCHtiFIiEAL 

KENFKVHPAHKVAMILDPQQKLRPVPPYQHEEIIGKVCELINEV 

KES WAEBADFE P AAKKP RSAAVEN P AAQ E DDRLG KNE VYD YLQ B 

PLFQATPDLFQYWSCVTQKHTKLAKLAFWL1AVPAVGARSGCVN 

MCEQALLIKRRRLLSPEDMNKLMFLKSNML 


6181 


169 


1032 


TRTLLSPVLLPGPRWKPWRRRPMGPLALPAWLOPRYRKNAYLFI 
YYLIQFCGHSWIFTNMTVRPPSFGKDSMVDTFYAIGLVMRLCQS 
VSLLELLH I YVG I ESNHLLPRFLQLTER X 1 1 LFWI TSQEEVQ E 
KYWCVLFVTWNLLDMVKYTYSMLSVIGISYAVLTWLSQTLWMP 
I Y P LC VIAEAFAI YQS LP YFES FGT YS TKLPFDLS I YFP YVLK I 
YLMMLFIGM YFTYSHI»YS ERRD T LGT FP I ncit KM* «; t n urvrvrD 
KDRLW I QCS K* NTGS I LVEKFL VF 


6182 


1769 


1224 


AS*1DYQLNTLLKEFQI*TEENTKLRYLTCSL1EDMAAAYFPDCI
VRPFGSSVNTFGKLGCDLDMFLDLDETRNLSAHKISGNFLMEFQ 
VKNVP SERIATQ KI LS VLGECLD1IPG PGCVGVQKILNARCPLVR 
FSHQASGFQ CDLTTNNR I ALTSfl ELL Y I YGAIiDSRVRALVFSVR 
CWARAHSLTSS IPGAWITNFSLTMMVIFFI^JRR^PP TT.PTr.nQT 
KTLADAEDKCVIEGNNCTFVRDLSRIKPSQNTETLBLLLKEFFE 
YFGNFAFDKNS INI RQGREQNKPDSS PL YI QNP FETS LNI S KNV 
SQSQLQKFVDLARESAWITjQQEDTDRPSISSNRPWGLVSLLLPS 

APNRKSFTKKKSNKFAIETVKKLLESLKGNRTENFTKTSGKRTI

STQT 


6183 


1118 


452 


HLDRYIKSPGSGSSTPAPPSHLLLYLLHPQSTRTMGCCGCSRGC
GSGCGGCGSSCGGCGSGCGGCGSGRGGCGSGCGGCSSSCGGCGS
RCYVPVCCCKPVCSWVPACSCTSCGSCGGSKGGCGSCGGSKGGC
GSCGCSQSSCCKPCCCSSGCGSSCCQSSCCKPCCCQSSCCVPVC
CQSSCCKPCCCQSTTCCVPVCCQCXI*GSGPRPSGFSCLVKAFLM

VP 


6184 


1 


2191 


IVTVREEDGAPAVAPPGVWSRANKRSGAGPGGSGGGGARGAEE 
EPPPPLQAVLVADSPI)RRFFPISKDQPRVLLPLANVAL1DYTLE 
FLTATGVQETFVFCCWKAAQIKEHLLKSKWCRPTSLNWRI its 
ELYRSLGDVLRDVDAKALVRSDFLLVYGDVISN IN ITRALEEHR 
LRRKL* KNVS VMTMIFKESS PSHPTRCHEDNVVVAVDSTTNRVL 
HFQKTQGLRRFAFPLSLFQGSSDGVEVRYDLLDCHISICSPQVA 
QLFTDNFDYQTRDDFVRGLLVNEEILGNQIHMHVTAKEYGARVS
NLHMYSAVCADVIRRWVYPLTPBANFTDSTTQSCTHSRHNIYRG 
PBVSLGHGSILEENVLLGSGTVIGSNCFITNSVIGPGCHIEPGD 
NVVLDQTYLWQGVRVAAGAQIHQSLIiCDNAEVKERVTLKPRSVL 
rSQVWGPNITLPEGSVISLHPPDAEEDEDDGBFSDDSGADQBK 
DKVKMKGYNPAEVGAAGKGYLWKAAG^fNMEEEEELOQNLWGLKI 
NMEE B S ES E S EQSMDSEEPDS RGGS PQMDDI KVFQNE VLGTLQR 
GXEENISCDNLVLEINSLKYAVNISLKEVMQVLSHWLEFPLQQ 
MDSPLDSSRYCALLLPLLKAWS PVFRNYIKRAADHLEALAAIED 
FFIjEHEALGISMAKVLMAFYQLEIIjAEETILSWFSQRDTTDKGQ 
QLRKNQQLQRFIQWLXEAEEESSEDD


6185 


791 


44 


PCTSCVLWATLHLPASXRKAPQAECGMISITEWQKIGVGITGFG

IFFILFGTLLYFT>SVLLAFGNLLFLTGLSLIIGLRKTFWFFFQR 

HKOKGTSFLLGGVVIVLLRWPLLGMFLETYGFFSLFKGFFPVAF 

GFLGNVCN I PFLGALFRRLQGTSS MV* KTEMSSLNLDHPTLKGAK 

REBWEPPPQSPALTHSPTYPGPPQVQKERNGAEQLTSNPQVDSR 

GCQEAEMQTPRRLGWGWYHTLTLYLWEEK 


6186


569 


238 


VYGIDSSNTNTHaAEER^KLKKHWKLCHAQSRLDVNGI^Kf^ 
KSRKVKNKVKNKADTEEVPNNSPTNQEKMPTSAILPDFSGSVIS
NIRNQMETLHSQPHQEENLCFEWSFSLINLLPINAVEPTSSQQI 
PNRETSEANKBRRKMTSKSSESWIYSPLTSFTTADSELHDIIKD 
LEDCLMVGLHTCGDLAPNTLRIFTSNSE I KGVCSVGCCYHLLSB 
EFENQH KERTQEKWG P PMCHYLKB ERWCCGRNARMS ACLALBRV 
AAGQGLPTESLPYRAVLQDIIKDCYGITKCDRHVGICIYSKCSSP 
L»DYVRRS LXKLGLDES KLPBKI I MNYY EKY KPRMNE LEAFNM L X 
WLAPCIETLILLDRLCYLKEQEDIAWSALVKLPDPVXSPRCYA 
VIAGKKQQ * FPL KQ 1 1 RCISL * DS AGCAE EVS VGDGG PALRDAP 
PSGSRVGSRYD 


6187 


1701 


771 


DAWGPETRU^ILNPDSFIEPRPGRLPELEATRPHMEPkASCPA 
AAPLMER K PHVLVGVTGS VAALKL PLL VS KLItDI PGLEVAWTT 
ERAKHFYS PQDI P VTLYSDADEWEMWKSRSD PVLHI DLRRWADL 
LIiVAPLDANTLG KVAS G I CPNLLTCVMRAWDRS KPLL FCPAMNT 
AMWEHPI TAQQVDQLKAFGYVE I PCVAKKLVCGDEGLGAMAEVG 
T1VDKVKEVLFQHSGFQQS*PGISVMGVPLYSEPTVQAKSVKMDV 

GKIGGYPHIjLNGGPALSIjPRGQACSRLNWTEGPGLSFFQPGEAA 
A 


6188


238 


1534 


KG FVNAGP LMAELQ VS PQWKAP EMS QI CLS CGH PSA* G P R WAS W 
N1GVFICIRCAGXHRNLGVHISRVXSVNLDQMTQEQIQCMQEMG 
KGXA?*RLYEAYLPETFRRPQinpAVEGFIRDKYEKKKYMDRSLD 
I NAFRKEKDD KWKRGSE PVPEKKLE P WPB KVKMPQKKEt) PQLP 
R KSS P KS TAP VMDliIiGLDAPVACS IANS KTSNTLEKDLDLLASV 
PSPSSSGSRKWGSMPTAGSAGSVPSWLNLFPEPGSKSEEIGKK 
QtiSKDS II»S LYGSQTPQMPTQAMFMAPAQMAYPTAYPS FPGVTP 
PNS I MGSMMP PP VGM VAQPGASGMVAPMAM PAG YMGGMQASMMG 
VPNGMMTTQQAGYMAGMAAMP<nVYGVQPA(MLQ>WLTQMTQQM 
AGMNFYGANGMMNYGQSMSGGNEQAANQTLSPOMWK 


6189 




793 


LGEPuGDLCELIPGDVQQLQMGBVHPGTGAQGSAAQSVAGEVQL 
TQLSHARQR PSCQGS QLIALDLQHMDI SRQPR WQHVOPVAROVQ 
RAQQAQLABGVAVHLWAGDAWAEVELLGEVGGGKVFAANACDL 
WQDHEGA / HAARQATGHALQRVIVQVRRVQPLEA1#*RVPSGLPR 
RVRAFMILHNQITGIGREDFATTYFLEELNLSYWRITSPQVHRD 
AFRKLRLLRSLDLSGWPXHMLPPGLPRNVHVLKVKRNEIAALAR 
GALAGMAQLRELYLTSNRLRSRALGPRAWVDLAHLQLLDIAGNQ 
LTEIPEGLPESLEYLYLQUNKISAVPANAFDSTPNLKGI FLRFN 
KLAVG S VVDSAFRRLKHLQ VLDIEGN LEFGD IS KDRGRLGKEKE 
BEEEDEVEEEETR 


6190 


66 


1309 


ILVGWSFLLSFAEYVCNCSVVGSLNVNRCNQTTGQCECRPGYQ 
GLHCETCKEGFYLNYTSGLCQPCDCS PHGALSI PCNSSGKCQCK 
VGVIGS ICDRCQDGYYGFSKNGCLPCQCNNRSASCDALTGACLN 
CQENS KGNH CE ECKEGFYQS PDAT K E CLRCPCS AVTSTGS CS I K 
SSELEPECDQCKDGYIGPNCKKCENGYYNFDSICRKCQCHGHVY 
PVKTP K ICKPE SGE CINCIiHNTTG F WCENCL* G YVHDtiEGNCI K 
KVILPTPEGSTILVSNASLTTSVPTPVINSTFTPTTLQTIFSVS 
TSENSTSALAJDVS WTQFNI I ILTVI 1 1 VWLLMGFVGAVYMYRE 
YQNRKLNAP FWTI ELKEDN I S FSS YHDS I PNAD VSGLLEDDGNE 
VAPNGQLTLTTPIHNYKA 


6191


1212 


iSii - 


VNL CHGGLLhI>S THHLG I KPSMH* LFFLMLS FPHLTPQQ P KCPS*~ 

MIDKIKOWYIYTMEYYATIKRNEIMFFAGTWMEMEAIILSKLM 
QDYMFS LI SGS 


6192 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEAGIEAVGSAAEE
KGGLVS DAYGEDDFS RIiGGDEDG YEBE EDBNSRQS EDDDS ETEK 
PEADDPKDNTEAEKRDPQELVASFSBRVRNMSPDEIKIPPEPPG 
RCSNHLQDKIQKLYERKIKEGMDMNYIIQRKKEPRNPSIYEKLI 
QFCAIDEIX3TOYPKDMFDPHGWSEDSYYEALAKAQKIEMDKLEK 
AKKERTKI E FVTGT KKGTTTNATS TTTTTAS TAVADAQKRKSKW 
DSAI P VTT I AQPTILTrTATLPAWTVTTSASGS KTTVT SAVGT 
IVKKAKQ 


6193 


3 



950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEAGIEAVGSAAEE
KGGLVSDAYGEDD FSRLGGDEDG YB 3EEDBNSRQSEDDDS3TE K 
P EADDPKDNTEAEKRDPQELVASPS ERVRNMS PDBIKI P PE PPG 
RCSNHLQDKIQKLYERKIXEGMDMNYI IQRKKEFKNPS I YEKLI 
QFCAIDELGTNYPKDMFDPHGWSBDSYYEALAKAQKIEMDKLEK 
AKKERTKIEFVTGTKKGTTTNATS TTTTTASTAVADAQ KRKS KW 

DSAI PVTTI AQPTILTTTATLPAVVTVTTSASGS KTTV ISAVGT 
IVKKAKQ 


6194 




950 


TRGCGNKMAG KKNVLSSLAVYAEDS EPESDGEAG IEAVGS AAEE 
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK 
PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEIKIPPEPPG 
RCSNHLQDKI QKL YERKI KEGMDMN YI I QRKKEFRNPS 1 YE KLI 
QFCAI DEIiGTNYPKDMFD PHG WS EDS Y YE ALAKAQKI EMD KLEK 
AKKERTKIEFVTGTKKGTTTNATSTTTTTASTAVADAQKRKSKW 
DSAI P VTT1AQPTX LTTTATLPA WTVTTS ASGSKTTVI SAVGT 
IVKKAKQ 


6195 


736 


235 


VANGLQSNMPKFYCDYCDTYLTHDS PS VRKTHCSGR KHKENVKD 
YYQKWMEEQAQS hi DKTTAAFQQGKI P PTPFSAP P PAGAM I PPP 
PSL PGP PRPGMMPAPHMGGP PMMPMMGP PP PGMM PVGPAPGMRP 
PMGGHMPMMPGP PMMR PPAR PMMVP TRPGMTRPDR 


6196 


1512 


623 


KTGKRRSAAYVRN ILDNAEQVl SNIiEARNIjGPRLTP LLQEEDSH 
QRLLMGLMVSELKDHFLRHLQG VE KKK I EQMVLD Y I SKLLDLI C 
HIVETNWRKKNI^SWVLHFNSRGSAAEFAVFHlMTRILEATNSti 
FLPLPPGFHTLHTILGVQCLPLHNLLHCIDSGVIiLLTETAVIRI, 

MKDLDNTEKNEKLKFSIIVRLPPLIGQKICRLWDHPMSSNIISR
NHVTRLLQNYKKQPRNSMINKSSFSVEFLPLETYFIEILTDIESS
NQALYPFEGHDNVDAEFVBEAALKHTAMLLGL


6197 


3 


819 


ADPEGTE2AVMSRYTRPPMTSLFIRNVADATRPEDLRREFGRYG 
P I VD VYI PLDFY7RR P RG FAYVQFEDVRDAEDALYNLNRKW VCG 
RQXE I Q FAQGDRKTPGQMKSKERHPCS PSDHRRSRSPSQRRTRS 
RSSS WGRNRRRSDS LKESRHRRFS YS Q S KSRSKS LPRR STS ARQ 
SRTPRRNFGSRGRSRSKSLQKRSKSIGKSQSSSPQKQTSSGTKS 

RSHGRHSDSIARSPCKSPKGYTNFETKVQTAKHSHFRSHSRSRS 
YRHKNSW 


6198 


111 


1912 


SEAALSPSFISPACFLIjRKLPALEDGTLPHPDTLGMNYEGARSE 
RENHAADD S EGGALDMCCSE RI .PGLPQPI VM EALDS AEGLQDS Q 
REMPPPPPPSPPSDPAQKPPPRGAGSHSLTVRSSLCLFAASQFL 
LACGVLWFSG YGH1 WSQNATNLVSS LLTLLKQLEPTAWLDSGTW 
GVPSLLLVFLSGGLVLVTTLVWHLLRTPPEPPTPLPPEDRRQSV 
SRQPSFTYSEWMBEKIEDDFLDLDPVPETPVFDCVMDIKPEADP 
TS iiTVKSMGLQERRGSNVSLTLDMCTPGCZNEEGFGYLMS PRESS 
AR E YLLS AS R VLQAE ELHE KALD P FLLQ AE F FE I PMNF VD PK E Y 
DI PGLVRKNRYKTILPNPHSRVCLTSPDPDDPLSS Y I NANYTRG 
YGGEEKVYIATQGPI VSTVADFWRMVWQEHTPI I VKITNIEEMN 
EKCTEYWPEEQVAYDGVEITVQKVIHTEDYRLRLISLKSGTEER 
G L JOTYWFTS W PDQKTPDRAPPLLHLVREVEEAAQQEG PHCAP II 
VHCSAG IGRTGCFI A T SI CCQQIiRQEG WDI LKTTCQLRQDRGG 
M IQHCEQYQFVHHVMSLYEKQLSHQS PE 


6199 


144 


1211 


MARENGESS^5WKKQAEDIKKIFEFKETLGTGAKSEVVLAEEKA 
TGKLFAVKCI PKKALKGKESSIENEI AVLRKIKHENI VALEDI Y 
ES PNHL YLVMQLVSGGEL FDR I VEKGFYTEKDASTLIRQVLDAV 
YYLHRKGI VHRDLKPENLLYYS QDEES KIMI S DFGLS KMEG KGD 
VMS TACGTPGYVAPEVLAQKPYSKAVDCWS IGVIAYTLLCG YPP 
FYDENDSK^EQILKAEYBFDSPYWDDISDSAKDFIRNLMEKOP 
NKRYTCEQAARHPW I AGDTALNKNIHES VSAQI R KNFAKS KWRQ 

AFTIATAVVRHKRiU»Hl#GSSLDSSNAS VSSSJUSLASQKDCAS G^F 
HAL* 


6200 


702 


36 


LPEVPHSLRPRVKPHLCCAQPAVRVMARLPKLAVFDIiDYTLWPF
tWDTHVDPPPHKSSDGTVRDRRGQDVRLYPEVPEVLKRIjQSLGV 
PGAAASRTS EIEGANQLIiELFDLFRYFVHRE I YPGS KITHFERL 
QQKTG I P FS QM1 FFDDERRNI VD VSKLG VTCI HI QNGMNLQTLS 
QGLETFAKAQTGPLRSSLEESPFEA 


6201 


2809 


2383 


GQT PR VR W KMR R S IiRAGKRRQTAGRKS KS P P K VP 3 V I QDDS L PA 
GPPPQIRILKRPTSnGWSSPNSTSRPTLPVKSLAQREAEYAEA 
PJCRILGSASPEEEQEKPILDRPTRISQPEDSRQPNNVIRQPLGP 
DGSQGFKQRR 


6202 


2 


426 


INADRAAVASSLLSRPTRKMAPOKDRKPKRSTWRFNLDLTHPVE 
DG I FDSGNFEQFLREKVKVKGKTGNLGNWHIERFKNKITWS E 
KQFSKRYLKYLTKKYI*KKNNLRDWLRWASDKETYELRYFQISQ 
DKDESESED 


6203 


419 


2550 


R C PR PPATAGAAASR P DRS P PSG I SGSEAAAGAGAAAPASQH PA 
TGTGAVQTE AMKQI LGVI DKKLRNLEKKKGXIJ>D YQERMNKG ER 
LNQDQLDAVS KYQEVTNNLEFAKELQRS FMALSQDIQKT I KKTA 
RREQLMREEAEQKRLKTVLELQYVLDKLGDDEVRTDLKQGLNGV 
P I I»S E EE LS L LDEFYKLVDPERDMSLRLNEQYEHAS I HL WDLLE 
G KEK P VCGT T YKVLKE I VE RVFQ S N Y FDS T HNHQNGLCE E 3 BAA 
SAPAVEDQVPEAEPEPAEE YTEQS EVESTEYVNRQFMAETQFTS 
GEKEQVDEW7VETVEVVNSLQQQPQAASPSVPEPHSLTPVAQAD 
PLVRRQRVQDLMAQMQGPYNFIQDSMLDFENQrLDPAlVSAQPM 
NPTQNMDMPQLVCPP VHS ESRLAQ PNQVPVQPEATQVFLVSSTS 
EGYTASQPLYQPSHATEQRPQKEPIDQ IQATI SLNTDQTTAS SS 
IiPAASQPQVFQAGTSKPLHSSGINVNAAPFQSMQTVFKMNAPVP 
PVNBPETLKQOWQYQASYWQSFSSQPHQVEQTELQQEQLQTWG 
TYHGS PDQSHQ VTGNHQQPPQQNTGF PRSNQ P YYN3RGVS RGGS 
RGARGLMNG YRG PANG PRGGYDGYRPS FSNT PNSG YTQSQFSAP 

RDYSGYQRDGYQQNFKRGSGQSGPRGAPRGRGGPPRPNRGMPQM 
NTQQVN 


6204 


2933 


787 


CTHNLISLLGGRALIHFNRFLNLKIQEGEAHNIFCPAYDCFQLV
PGD Z IKS WSKEMDKRYLQFDIKAFVENNPAIKWCPTPGCDRAV 
RLTKQGSOTTSGSDTLSFPLLRAPAVDCXSKGHLFCWECLGEAHEP 
CDCQTWKNWLQKITEMKPEELVGVS EAYEDAANCLWLLTNSKPC 
ANCKS P 1 QKNEGCNHMQCAKCKYDF CWI CLEEWKKHS FVHWE V I 
YRCTRYSVIQHVEEQSKEMTVEAEKKHKRFQELDRFMHYYTRFK 
NHEHS YQLEQRLLKTAKEKMEQLSRALKETEGG CPDTTF IEDAV 
HVLLKTRRI LKCS YP YGFFLEP KSTKKE I FELMQTDLEMVTEDL 
AQKVNRP YLRTPRHKI IKAACLVQQKRQEFLAS VARGVAPADS? 
EAPRRSFAGGTWDWEYLGFASPEEYAEFQYRRRHRQRRRGDVHS 
XjLSNPPDPDEPSESTLDIPEGGSSSRRPGTSWSSASMSVLHSS 
SURDYTPASRSEWQDStiQALSSLDEDDPNILLAIQLSLQESGLA 
LDEBTRDFLSNEASLGAIGTSLPSRLDSVPRNTDSPRAALSSSE 

LLELGDSLMRLGAENDPFSTDTLSSKPLSEARSDFCPSSSDPDS
AGQDPNINDNLLGNIMAWFHDMNPQSIALIPPATTEISADSQLP
CIKDGSEGVKDVELVLPEDSMFEDASVSEGRGTQIEEWPLEENI
PGGGKQHPQAW


6205 


1 


1200 


RAHRGKMALEVGDMEDGQLSDSDSDMTVAPSDRPtQLPKVLGGb' 
SAMRAFQNTATACAPVSHYRAVESVDSSEESFSDSDDDSCLWKR 
KRQKCFNP PPKPBPFQFGQSSQKP PVAGGKKINNI WGAVLQEQN 
QDAVATELGIU3MEGTIDR5RQSETYNYLIJUCKLRKESQEHTKD 
LDKELDB YMHGG K KMGSKEEEWG QGHLKR KRPVKDRLGNRPEMN 
YKGRYEITA^SQEKVADBISFRWEPKKDLIARWRIIGNlOCA~ 
XELLMETABVEQNGGLFIMNGSRRHTPGGVFLNLLKNTPSISES 
QIKDIFYI ENQKE YENKKAARKRRTQVLGKKMKQAIKSLNFQED 
DDTSRETFASDTNSALAS LDE S QEGHAEAKLEAEEA I EVDHSHD 
LDIF 


6206 


10 


1442 


IISERRERSC3JHLVCIRCSCDVVEMGSVLGLCSMASWIPCLCGS 
APCLLCRCCPSGNNSTVTRIilYALFLLVGVCVACVMLIPGMEEQ 
LNiUPGFCENEKGVVPCNILVGYKAVYRLCFGLAMFYLLLSLLM 
IK7KSSSDPRAAVHNGFWFFKFAAAIAIIIGAFFIPEGTFT7VW 
F YVGMAGAFCFILIQLVLLI D PAHSKNBS WVEKMEEGNSRCWYA 
ALLS ATALN YLLSLVAI VLFF VY YTHPAS CS ENKAF1 S VNMLLC 
VGASVMSILPKIQESQPRSGLLQSSVITVYTMYLTWSAMTNEPE 
TNCNPSLLS1 IGYNTTSTVPKEGQSVQWWHAQGI IGLILFLLCV 
FYS S 1RTS NNS QVNKLTLTSDES TL I BDGG ARSDG S LEDGDDVH 

PAVDNERDGVTYSYSFFHFMLFLASLYIMMTLTNWYRYEPSREM 
KSQWTA VWVKI SSS W I G I V t,y vwrr ,un d t.vt tmd n or* 


6207 


2924 


1471 


TVMAEAATPGTTATTSGAGAAAATAAAASPTPIPTVTAPSLGAG
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
SWCKYFQRGYCIYGDRCRYEHSXPLKQEEATATELTTKSSLAA 
SSS LSS I VGPLVEMNTGE AESRNSNFATVGAGSEDWVNAI EFVP 
GQPYCGRTAPSCTBAPLQGSVTKE2SEKEQTAVETKKQLCPYAA 
VGECRYGBNCVYLHGDSCDMCGLQVLHPMDAAQRSQHIKSCIEA 
HE JCDMELS FAVQRSKDMVCG I CMEWYEKANPSERRFGILSNCN 
HTY CLKCI R KWRS AKQ FES KIIKSCPECRI TSNFVI PS B YWVEE 
KEEKQKL I LKYK3AMSNKACRYFDEGRGS CPFGGNCFYKHAYPD 
GRREEPQRQKVGTS SR YRAQRRNH ? WEL J EERENSNP FDNDEES 
WCFSLGEMLLMLLAAGGDDELTDSEDEWDliPHnpT pnpvnT m 


6208 


2924 


1471 


TVMAEAATPGTTATTSGAGAAAATAAAASPTPIPTVTAPSLGAG

GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
SWCKYFQRGYCIYGDRCRYEHSKPLKQEEATATELTTKSSLAA 
S SSLS S XVGP LVEMNTGEAES R NSNFAT VGAGS EDW VNAIEFVP 
GQ PYCGRTAPS CTEAPLQGS VTKEES EKEQTAVETKKQLCP YAA 
VGECR YGENC VYLHGDS CDMCG LQV LHPMDAAQRSQH IKS CIEA 
K E KDME LS FAVQRS KDMVCG I CME VVYEKAN P SERR FG I L S NCN 
HTYCLKCIRKWRSAKQFES KIIKS CP ECR I TSNFVI PSE YWVEE 
KEEKOKLILKYKEAMSNKACRYFDEGRGSCP FGGNCFYKHAYPD 
GRRBEPQRQKVGTSSRYRAQRRNHFWELIBERENSNPFDNnREB 
VVTFEI/5EMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL 


6209 


1758 


829 


ERLCFPCMQS KI YS YMSPNKCSGMRFPLQH KNS VTHHE VKCQGK 
PLAG I YRKREEKRNAGNAVRSAMKSEEQKI KDAR KGPLVP FPNQ 
KS BAAEPPKTPPSSCDSTNAAIAKQALKKPIKGKQAPRKKAQGK 
TQ QNRKLTDFYP VRRS SRKSKABLQS EERKRI DEL IESGKE EGM 
KIDL ZDGKGRG V I ATKQFSRGD FVVEYHGDLI EI TDAKKRE ALY 
AQDPSTGCYMYYFQYLSKTYCVDATRETNRLGRLINHSKCGNCQ 
TKLHDI DG VPHL I LI ASRD I AAGEELL YDYGDRS KAS I EAHPWL 
KH 


6210 


3761 


387 


IFGMSKLR^LLEDSGSADFRRHFVNLSPFTITVVLLLSACFVT 
SSLGGTDKELRLVDGENKCS GRVE VKVQEEWGTVCNNGWSMEAV 
S V ICNQLGCPTA I KAPG WANS S AGSGR I WMDHVS CRGNESALMD 
CKHD G WGKKSNCTHQQDAG VTCSDG SNLEMRLTRGGNMCSGR I E 
IKFQGRWGrVCDDNFNIDHAS VICRQLE CGSAVS FSGSSNFGEG 
SGPIWFDDLICNGNESALWNCKHCGV7GKHNCDHAEDAGVICSKG 
ADLSLRL VDG VTECSGRLB VR FQGEWGTICDDGWDS YDAAVACK 
2LGCPTAVTAIGRVNAS KGFGHI WLDS VS CQGHE PAVWQ CKHHE 
WGKHY CNHNEDAGVTCSDGSDLELRLRGGGSRCAGTVEVEIQRL 
LGKVCDRGWGLKEADVVCRQLGCGSAIiKTS YQVYS KXQATNTWL 
FI*SSCNGWBTSIjWDCKNWQW(3GIiTCDHYBEAKITCSAHR3PR1,V 
GGDI PCSGRVEVKHGDTWGS I CDSDFSLBAASVLCRELQCGTW 
SILGGAHFGEGNGQIWAEEFQCEGHESHLSLCPVAPRPBGTCSH 
SRDVGWCSRYTBIRLVNGKTPCEGRVELKTLGAWGSLCNSHWD 
I EDAHVLCQQLKCG VALSTPGGAR PGKGNGQI WRHMFHCTGTEQ 
HMGDCPVTALGASLCPSEQVASVICSGNQSQTLSSCNSSSLGPT 
RPTI PEES AVAC I ESGQLRLVNGGGRCAGRVE I YHEGS WGTI CD 
DS WDLSDAHVVCRQIK5CGEAINATGS AHFGEGTG P I WLDEMKCN 
GKES R I WQ CHSHGWGQQNCRH KE DAG VI CS E FMSLRLTSEASRE 
ACAGRIiEVFYNGAWGT VGKSSMS ETTVGVVCRQLGCADXGKINP 
ASLDKAMSIPMWVDQ^QCPKGPDTLWQCPSSPWEKRLASPSEET 
W I TCDNK I RLQEG PTS CSGRVE I WHGG S WGTVCDDS WDLDDAQV 
VCQQU5CGPALKAFKEAEFGQGTGPIWLNEVKCKGNESSLWDCP 
ARRWGHS E CGHKEDAAVNCTD I S VQKTPQKATTGR S S ROSS F I A 
VG ILGWLIAI FVALFFLTKKRRQRQRLAVSSRGENL VHQIQYR 
EMN SCLNADDLDLMNS S GGHSE PH 


6211 



3761 


387 


I FGMSKLRMVLLEDSGSADFRRHFVNLSPFT ITWLLLSACFVT 
SSLGGTDKELRLVDGENKCSGRVEVKVQEBWGTVCNNGWSMEAV 
S V I CNQLG CPTAIKAPG WANS S AGSGRIWMDIIVSCRGNESALWD 
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRIiTRGGNMCSGRrE 
I KFQGRWGTVCDDNFNIDHASVI CRQLECGSAVS FSGSSNFGEG 
SGPI WFDDLI CNGNESALWNCKHQGWGKHNCDHAEDAGVI CSKG 
ADLSLRLVDG VTECSGRLE VRFQGE WGTI CDDGWDS YDAAVACK 
QLGCPTAVTAIGRVNAS KG FGHI WLDS VS CQGHEPAVWQ CKHHE 
WGKHYCNHNBDAGVTCSDGSDLELRLRGGGSRCAGTVEVEIQRL 
LGKVCDRGWGLKEADVVCRQLGCGSALKTSYQVYSKIQATNTWL 
FLSSCNGNETSLWDCKNWQWGGLTCDHYEEAKITCSAHREPRIiV 
GGDI PCSGRVEVKHGDTWGS I CDSDFSIiEAASVLCREIiQCGTVV 
SILGGAHFGEGNGQIWAEEFQCBGHESHLSLCPVAPRPEGTCSH 
SRDVGWCSRYTEIRLVNGKTPCEGRVBLKTLGAWGSLCNSHWD 
IEDAHVLCO OLKCG VALSTPGGAR FG KRHRD T wt? umctj r»w to r» 

HMGDCPVTAiGAS LCPSEQ VAS VI CS GNQSQTLSS CNS SSLG PT 
RPTIPEESAVACIESGQLRLVNGGGRCAGRVBIYHEGSWGTICD 
DS WDLS DAHWCRQLG CGE AINATGS AH FGEGTGP I WLDEMKCN 
GKESRI WQCHSHG WGQ QNCRHKEDAGVI CS E FMS LRLTSEAS RE 
ACAGRLB VFYNGAWGTVGKS SMSETT VG WCRQLG CADKGK INP 
ASLDKAMSIPMVJVDNVQCPKGPDTLWQCPSSPWE KRLAS PS EET 
WI TCDNKIRLQ EGPTS CSGRVE I WHGGSWGTVCDDS WDLDDAQV 
VCQQI»GCGPAL KAFKEAEFGQGTG P I WLNE VKCKGNESSLWDCP 
ARRWGHSECGHKEDAA VNCTDI S VQKTPQKATTGRS SRQSS F I A 
VGILGWLLAI FVALFFLTKKRRQRQRIAVSSRGENLVHQIQYR 
EMNSCLNADDLDLMNSSGGHSEPH 


6212 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTNPGPPSSLRRAFRR 
RELPFPACHE IGLGAEAGS G PPPAPAARBSRSRAMEEE ASS PGI, 
GCSKPHLEKLTLGirRILBSSPGVTEVTirEKPPAERHMISSWE 
QKNNCVMPEDVKNFYLWINGFHMTWS VKLDEHI IPLGSMAINS I 
SKLTQLTQS SMYSL PNAPTLADLEDDTHEAS DDQPEKPH FDS R S 
VI FELDSCNGSGKVCLVYKSGKPALAEDTEI WFLDRAL YWHFLT 
DTFTA Y YRLLITHLGLPQWQYAFTS YG I S PQAKQRVSMYKP I T Y 
NTNLLTEETDSFWKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6213 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAP RSCCCQTNP GPPS S tjRRAFRR 
RELPFPACHEIGLGAEAGSG P P PAPAARESRS RAM E E EAS S PGL 
GCSKPHLE KLTLGI TR I LES S PGVTE VTI I E KPPAERHMI SS WE 
QKl^C^PEDVKNFYLMTNGFHMTWSVKLDEHIIPLGSMAINSI 
SKLTQLTQSSMYSLPNAPTLADLEDDTHEASDDQPBKPHFDSRS 
VIFELDSCNQSGKVCLVYKSQKPALAEDTEIWPLDRALYWHFLT
DTFTAYYRLLITHLGLPQWQYAFTSYGISPQAKORVSNYKPITY 
NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGOK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6214 


2 


460 


HELAPSAIRRAARLGLGPARWQSRAAAFYFVRGFR'KSWSFVGWV 
VI/3TS AKRTRLFFFLSKMAASSRAQVLALYRAMLRES KR FS AYN 
YRTYAVRR I R DAFRENKNVKDP VE IQTLVNKAKRDLGVIRRQVH 
IGQLY9TDKLIIENRDMPRT 


6215 


2 


1849 


FVAGGPRGSGSAABTMPE I RVTPLGAGQDVGRS CILVS IAGKNV 
KLDCGtf HMG FNDDRRFPDFSY I TQNGR riTDFLDCVI I SHFHLDH 
CGALPYFSEMVGYDGP I YMTHPTQAIC PILLED YRKIAVDKKGE 
ANFFTSQMIKDCMKKVVAVHLHQTVQVDDELEIKAYYAGHVLGA
AMFQI KVGSESWYTGDYNMTPDRHLGAAWIDKCRPNLLITEST 
YATTIRDSKRCRERDFLKKVHBTVERGGKVLIPVFALGRAQELC 
ILLETFWE^INLKVPIYFSTGLTEKANHYYKLFIPWTNQKIRKT 
FVQRNMFEFKHI KAFDRAFADNPGPMWFATPGMLHAGQSLQIF 
RKWAGNEKNMVI MPGYCVQGTVGHK I LSGQRKLEMEGRQVLEVK 
MQVEYMSFSARADAKGIMQLVGQAEPESVLLVHGEAKKMEFLKQ 
KI EQELRVN CYM ? ANGBTVTLPTS PS I PVGISLGLLKREMAQGL 
LPEAKKPRLIJCGTLIMKDSNFTILVSSEQALKELGLAEHQLRFTC 
RVHLHDTRKBQETALRVYS HLKSVLKDHCVQHLPDGSVTVESVt. 
LQAAAPSEDPGTKVLLVSWTYQDEBLGSFLTSLLKKGLPQAP3 


6216


11 


393 


QTTR PE PRNSALRQSRS KMAWGVS S VSRLLGRS RPQLGRPMSS 
GAHGEEGSARMW KTLTFFVALPGVAVSMLNVYLKSHHGBHERPE 
FIAYPHLRIRTK PFP WGDGNHTLFHN FHVNPLP TG YE DB 


6217


9 


1178 


TRVGRGESGLKMEVKPPPGRPQPDSGRJIRRRRGBEGHDPXEPEQ 
LRKLFIGGLS PETTDDSLREHFEKWGTLTDCWMRDPQTKRSRG 
FGFVTYSCVEBVDAA>3CARPHKVDGRVVEPKRAVSRED£VKPGA 
HLTVKKIFVGG1JCEDTEEYNLRDYFEKYGKIETIEVMEDRQSGK 
KRGPAFVTFDDHDrVDKI WQKYHTINGHKCEVKKALSKQEMQS 
AGSQRGRGGGSGNFMGRGGNFGGGGGNFGRGGNFGGRGGYGGGG 
GGSRGSYGGGDGGYNGFGGDGGNYGGGPGYSSRGGYGGGGPGYG 
NQGGGYGGGGGYDGYNEGGNFGGGNYGGGGNYNDFGNYSGQQQS 
NYGPMKGGS FGGRSSGSP YGGGYGSGGGSGGYGSRRF 


6218 


1305


906 


S CERRGF I MADDLKRFLY KXL PS VEGLHAI WSJ>RDG 1/PVl KVA 
NDNAPEHALRPGFLSTFALATDQGSKLGLSKNKS1ICYYNTYQV 
VQFNRLPLWSFIASSSANTGLrVSLEKELAPLFEELRQWEVS 


6219 


2 


890 


>iVjir'ijn(jAijA^» i KUAo/UiAEMASAGGEDCESPAPEADRPHQRPFIj 
IG VSGGTAS GKS TVCE KIMELLGQNE VEQRQRKWI LSQDR F YK 
VLTAEQKAKALKGQYNFDHPDAFDNDLMHRTLKNIVEGKTVEVP 
TYDFVTHS RLPETrVVYPADVVLFEGI LVFYSQBIRDMFHLRI»F 
VDTDSDVRLSRRVLRDVRRGRDLEQILTQYTrFVKPAFEBFCLP 
TKKYADVI IPRGVDNMVAINLIVQHIQD1LNGDICKWHRGGSNG 
RSYKRTFSEPGDHPGMLTSGKRSHLESSSRPH 


6220 


227 


764 


EQNISLBMSCTIEKALADAKALVERLRDHDDAAESLIEQTTALN 
KRVEAMKQYQEEIQELNEVARHRPRSTLVMGIQQENRQIRELQQ 
EN KEL RT S LEEHQS ALE H MS KYREQMFRLLMASKKDDPG1 1 M K 
LKEQH3 K IDM VHRNKS EGFFLDASRHILEAPQHGLERRHLEAN'Q 
NVH 


6221 


98 


916 


RW IWDLNPVS DGLELRPKYKG I LHCLTT I W KLDGLRGLYQGVTP 
NI WGAGLSWGLYFVFYNAI KS YKTEGRAERLE ATE YLVSAAE AG 
AMTLC I TU PLW VTKTRLMLQ YDAWNS PHRQ YKGMFDTLVKI YK 
YEGVRGL YKGFVPGLFGTSHGALQFMA YELLKLKYKQH INRLP E 
AQLSTVE YI S VAALS KI FAVAAT YP YQVVRARLQDQHMF YSGVI 
DVI TKTWRKEGVGGFYKG IAPNL IRVTPACC IT FVVYENVSHFL 
LDLREKRK 
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Predicted end 
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amino acid 
sequence 



HIT 



6223 



"6224" 



6225 



3259 



6226 
6227



23 



-25*817 



715 



133 



338 



266 



890 



Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
MARELRALLLWGRRLRPLLRAP ALAAVPGGKP I LCPRRTTAQLG 
PRRN PAWSLQAGRLFSTQTAEDKEEPLHS I ISSTES VQGSTS KH 
EFQABTfOCLLDIYARSLYSEKEVPIRELISNASDALEKLRHKLV 
SDGQALP EM E IHLQTMAE KGTI T 1 QDTG IGMTQEELVSNLGTIA 
RS GS KAFLDALQNQAEAS SKI I GQFG VGFYSAFMVADR VE VYSR 
SAAPGSLG YQWLSDGSGVPE IABASGVRTGTKI 1 1 HLKSDC KEF 
SSEARVRDWTKYSNFVSPPLYtiNGRRKNTLQAIWMMDPKDVRE 
WQHEEFYRYVAQAHDKPRYTLHYKTDAPUrrRSIFYVPDMKPSM 
EDVSRELGSSVALYSRKVLIQTKATDI LPKWLRFI RGWDS EDI 
PLNLSRELLQESAIiIRKLRDVLQQRL I XFF IDQSKKDAEKYAKF 
FEDYGLFMR3GIVTATEQEVKEDIAKLLRVESSALPSG0LTSLS 
E YASRMRAGTRNI YYLCAPNRHLAEHS PYYEAMKKKDTEVLFCF 
EQFDELTLLHLREFDKKKLISVETDIWDHYKEEKFEDRSPAAE 
CLS EKETE ELMAWMRNVI^SR VTWVK VTLR1*DTHPAMVTVLEMG 
AARHFLRMQQLAKTQEERAQLLQPTLEINPRHALIKKLNQLRAS 
E PGIAQLL VDQI YENAM IAAG LVDDPRANVGRLNBIiLVKALERH 



DAWARTMAGMVDFQDEEQVKS FLENMEVE CN YHC YHEKD PDGCY 
RLVDYLEGI RKNFDEAAKVLKFNCBENQHSDS CYKLGAYYVTGK 
GGliTQDL KAAARCFLMACEKPGKKS IAACHNVGLLAHDGQVNED 
GQPDLGKARDYYTRACDGGYTSSCFNLSAMFLQGAPGFPKDMDL 
ACKYSMKACT1XMIWACANASRMYKLGDGVDKVEAKAEVLFNRA 
QQVHKEQQKGVQPLTFG 



LRTI S SMAWG PLLLTLLAHCTGS WAQS V LTQ P P S VSGAR I P&EK 



JjLS CHRIAl CKIjPFS VESRKTVMGPQGARRQAFLAFGDVTVDFT 
QKEWRIiLSPAQRALYREVTLENYSHLVSLGILHSKPBLIRRLEQ 
GE V P WGE BRRRRPG PCAG I YAEHVLRP KNLGIaAHQRQQQLQ PSD 

qsfqsdtaegqekbkstkpmafsspplrhavssrrrnsweies 
sqgqrenpteidkvlkgiensrwgafkcabrgqdfsrkmmviih 
kkahsrqklftcrechqgfrdesalliuhqnthtgeksyvcsvcg 
rgfslkanllrhqrths ge kpflckvcgrg yts ks yltvherth 

TGE K? YE CQECGRR FNDKS S YNKHLXAHSGEKP F VC KECGRGYT 
NKSYFWHKRIHSGEKPYRCQECGRGFSNKSHLITHQRTHSGEK 
PFACRQCKQS FS VKGS IiLRHQRTHSGEKPFVCKDCERSFSQKST 
LVYHQRTHSGEKPFVCRECX3QGFIQKSTLVKHQITHSEEKPFVC 
KDOGRG FI QKS TFTLHQRTHSBEKP YGCRECGRRFRDKS S YNKH 
LRAHIXJEWRPFCRDCGRGFTLKPNLTIHQRTHSGEKPFMCKQCE 
KS FSLKANLLRHQWTHS GERP FNCKDCGRGFI LKS TLLFHQKTH 
SGEKP P ICSE CGQGFI W KSNLVKHQLAHSGKQ P FVCKE CGRG FN 
WKGNLLTHQRTHSGEKPFVCNVCGQGFSWKRSLTRHHWRIHSKE 
KPFVCQECKRG YTSKS DLTVHERI HTGERPYECQECGRKFSNKS 
YYSKHLKRHLREKRFCTGSVGEASS 



TKVSELU3GSQRI*FFLPLWRRLCRCGLGPRVSPMAGPRVEVDGS 
IMEGGGQSLRVSTGLSWLLSLPWRAQRIRAGRSYA 



MSASSLLEQRPKGQGNKVQNGSVHQKDGIjNDDDFEPYIjSPQARP 
NNAYTAMSDSYIiPSYYSPSIGFSYSLGEAAWSTGGDTAMPYLTS 
YGQLSNGBPKFLPDAMFGQPGALGSTPFLGQHGFNFb'PSGIDFS 
AWGNNSSQGQSTQSSGYSSWYAYAPSSLGGAMIDGQSAFANETL 
NKAPGMNTIDQGMAALKLGSTE VASNVP KWGS AVGSGS I TSNI 

VASNSLPPATIAPPKPASWADIASKPAKQOPKLKTKNGIAGSSL 
PPPI KHNMDIGTWDNKGPVAKAPSQALVQNIGQPTQGSPQPVG 
QQANN S P PVAQAS VGQQTQPLPPPP P QPAQLS VQQQAAQ PTRWV 
APRNRGSGFGHNGVDGNGVGQSQAGSGSTPSEPHPVIiEKLRSIN 
N YNPKD FDWNLKHGRVFI IKS YSEDD IHRS I KYN I WCS TEHGNK 
RLDAAYRSMNGKGPVYLLFSVNGSGHFCGVAEMKSAVDYNTCAG 
VWSQDKWKGRFDVR W I FVKDVPNSQLRHIRLENNENKPVTNSRD 
TQEVPLEKAKQVLKI IASYKHTTS I FDDFSHYEKRQ 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6228 


47 


1978


GRRCRRRGAVMELAQBARELGCWAVEEMGVPVAARAPESTLRRL 
CLGQGADI WAY 1LQHVHSQRT VKKI RGNLLW YGHQDS PQVRR KL 
ELEAAVTRLRAEIQELDQSLELMERDTEAQDTAMEQARQHTQDT 
QRRAI»IjLRAQAGAMRRQQHTLRDPMQRLQMQLRRLQDMERKAKV 
DVTFGS I»TS AALGLE PWLRDVRTACTLRAQPLQNLIiljPQAKRG 
SLPTPHDDHFGTSYQQWLSSVElliLTNHPPGHVLAALEHLAAER 
EAEIRSLCSGDGLGDTEISRPQAPDQSPSSQTLPSMVHLIQEGW 
RT VG VLVS QRS TLLKERQVLTQRLQGLVEE VERR VLGSS ERQVL 
ILGLRRCCLWTELKAIiHDQSQELQDAAGHRQLLLRELQAKQQRI 
LH WRQLVE ETQEQ VRLLI KGMSAS KTRLCRSPGB VLALVQRKW 
PTFEAVAPQSRELLRCLEEEVRHLPHILLGTLLRHRPGBLKPLP 
TVLP S I HQLHPAS P RGSS FIALS H KLGI>P PGKAS ELLLPAAAS L 
RQDLLLLQDQRSLWCWDLLHMKTSLPPGLPTQELLQIQASQEKQ 
QKENLGQAL KRLEKLLKQALERI PEIiQG I VGDWWEQPGQAALSE 
ELCQGLSLPQWRLRWVQAQGALQKLCS 


6229
6230


1S71 


560 


GPSIiLGTRGTPNPARTLQIFFLXXGKRLTGRMAAVDDLQFEEPG 
NAATSLTAN PDATTVNI EDPGETPKHQPGSPRGSGREEDDELLG 
NDDSDKTELLAGQKKSSPFWTPEYYQTFFDVDTYQVPDRIKGSL 
LP I PGKNFVRLYIRSNPDLYGPFWICATL VFAIAISGNLSNFLI 
HLGEKTYH YVPEFRKVS IAATI I YAYAWL VP LALWG FLMWRNS K 
VMNIVSYSFliEIVCVYGYSljPTYTPT&TT.UT Tovrtr^rorarT imljt 
ALG I SGS LLAMTFWPAVREDNRRVALAT I VT I VLLHMLLSVGCL 
AYFFDAPEMDHLPTTTATPNQTVAAAKSS 




1723 


600 


SKMS GRSGKKKMSKL5R S ARAG V IF fc> VGRLMR Y LKKGTFKYR I S 
VGAP VYMAAVI E YLAAE I LELAGNAARDNKKARIAPRHI LLAVA 
NDEELNQLLKGVTIASGGVLPRIHPELLAKKRGTKGKSETILSP 
PPE KRGRKATSGK3CGGKKSKAAKPRTSKKSKPKT»qnTrpr tqm qt 

SEDGPGDGFTILSSKSLVLGQKLSLTQSDISHIGSMRVEGIVHP 
TTAB IDLKBD IGKALEKAGGKEFLETVKELRKSQGPLEVAEAAV 
SQSSGLAAKFVIHCHIPQWGSDKCEEQLEETI KNCLSAAEDKKL 
KSVAFPPFPSGRNCFPKQTAAQVTLKAISAHFDDSSASSLKNVY 
FLLFDSES 1GI YVQEMAKLDAK 


6231 


149 


870 


Ltl FSSS TMDKSitRNVLWS FGFI*LLFTAYGGLQSLQSSLYSEEG 
LG VTALSTL YG GMLLS SMFLPP LL I ER LGCKGT I IXSMCG YVAF 
S VGNFFAS W YTL I PTS I LLGIX2AAPLWSAQCTYLTI TGNTHAEK 
AGKRGKDMVNQYFGIFFLIFQSSGVWGNIjISSLVFGQTPSQETL 
PEEQLTS CGAS DCLMATTTTNSTQRPS QQhVYTliL&IYTGSGVlt 
AVLMI AAFLQP I RDVQRESE 


6232 
6233


3679 
1 


1476 
2654 


^vagttmagfwVGtapLvaagrrgrwppqqlmlsaalrtlkhvl 
yysrqclmvsrnlgsvgydpnektfdkilvanrgeiacrvirtc 
kkmgiktvaihsdvdassvhvkmadbavcvgpaptsksylnmda 
imeai kktraqavhpgygflsenkefarclaaedwfigpdtha 

rQAMGDKIESKLLAKKAEVNTIPGFDGVVKDAEEAVRIARE IGY 

pvmikasaggggkgmriawddeetrdgfrlssqeaassfgddrl 

LIEKFI DNPRHIEIQVLGDKHGNALWLNERECS IQRRNQKWEE 
APS I FLDAETRRAMGEQAVALARAVKYSSAGTVE flvds kkwfy 
FLEMNTRLQVEHP VTECITGLDLVQEM I RVAKG Y PLRHKQADI R 
I NG WAVECRVYAEDP YKS FGLPS IGRLSQYQE PLHLPGVRVDSG 
1QPGSDI S I YYDPMISKLITYGSDRTE ALKRMADALDNYVIRGV 
THN IALLRE VI INS R FVKGD I STK FLS D VYPDGFKGHMLTKS E K 
NQLIAIASSIiFVAFQI^AQHFQENSRMPVrKPDIANWEI)SVKLH 
DK3/HTWASIWGSVFSVEVIX5SKT,NVTSTW^^ 
QRTVQCLSREAGGNMSIQFLGTVYKVNI LTRLAAE LNKFMLE KV 
TEDTS S VLRS PMPG VWAVS VKPGDAVAEGQEICVT EAMKMQNS 
MTAGKTGTVKSVHCQAGDTVGEGDLLVELE 

HSTRENLNAGNFNFPS EGHLVRSTG PGGS FAKHMVAQ CVS P"KGP j 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








1ACSRT YFFGATHVP YLGGDS KLP KKTEQ IRLLSQ I YAAV I EAV 
LAGIAC YAKTSS LTKAKEVAEQTLGSGLDS FELI P FKAALRSKM 
TFHIHAVNNQGR IVPLDS EDS hS FVKTACMAVYDI PDLLGGNGC 
LGSWFSESFLTSQIIiVKEKDGTVTTETSSWLTAAVPRFCSWL 
VEDNEVKLSEKTHQAVRGDESF^GTYLTGGEGAYLYSSNLQSWP 
E EGNVH F FS SGLLFS HCRHG S 1 1 I SKDHMNS ISFYDGDS TS TVA 
ALLIDFKS SLLPHLP VHFHG S SNFLM I AI#FP KSK I YQAFYfl B VF 
SLWKQQDNSG1SLKVIQEDGLSVEQKRLHSSAQKLFSALSQPAG 
EKRSSLKLLSAKLPELDWFLQHFAISSISQEPVMRTHLPVLLQQ 
AE INTTHRI ESDKVI I S I VTGLPGCHASELCAFLVTLHKECGRW 
MVYRQIMDSSECFHAAHFQRYLSSALEAQQNRSARQSAYIRKICT 
RLLWLQGYTDVIDWQALQTHPDSNVKASFTIGAITACVEPMS 
CYMEHRFLFPKCLDQCSQGLVSNWFTSHTTEQRHPIJUVQLQSL 
IRAANPAAAFILAENGIVTRNEDXELILSEKSFSSPEMLRSRYL 
MYPGW YEGKIiNAGS VYPLMVQ I CVWFGRPLEKTRFVAXCKAI QS 
S I KPSPFSGNI YHI LGKVKFSDSERTMEVCYNTLANSLS IMPVL 
EGPTPPPDSKSVSQDSSGQQECYLVFIGCSLKEDSIKDWEaRQSA 
KQKPQRKALKTRGMLTQQEIRSIHVKHHJjEPLPAGYFYNGTQFV 
NFFOTKTDFTCPr^QFMNDYVEEANREIEKYNQELEQQEYHDLF 
BLKP 


6234 


1731 


404 


PRVREDMDHKSPGNKGSLVYAGIKSIVKSSIiGMVESSRHNWSGL 
D KQSDI QNLNEERI LALQLCG W I KKGTDVDVGP FLNS LVQEGE W 
ERAAAVALFNLDIRRAIQILNEGASSEKGDijiLNWAMALSGYT 
DEKNSLWREMCSTLRLQLNNPYLCVMFAFLTSETGSYDGVLYEN 
KVAVRDRVAFACKFT»S DTQLNR YIEKLTNEMKEAGNL EG I LVTG 
LTKDGVDLMESYVDRTGDVQTASYCMT,QGSPLDVLKDERVQYWI 
ENYRNLLDAWR FWHKRAEFD IHRS XLDPS S KPIAQVFVS CNFCG 
KSISYSCSAVPHQGRGFSQYGVSGSPTKSKOTSCPGCRKPLPRC 
ALCLINMGTPVSSCPGGTKSDEKVDLSKDKKIAQFNNWFTWCHN 
CRHGGHAGHMLSWFRDHAECPVSACrrCKCMQIJ)TTGNIiVPAETV 
QP 


6235 


1 


571 


E KRDHRLPS W PRAALKVPGRGGRVGTTPE LAAGG IMATRNPPPQ 
DYES DDDS YEVLDLTE YARRHQW WN R VFGHS SGPMVB KYS VATQ 
I VMGG VTGWCAGFLFQKVGKLAATAVGGGFLLLQ IAS HSG YVQ I 
DWKRVEKDVNKAKRQ I KKRANKAAPEINNLIEEATEF I KQN1VI 
SSGFVGGFLLGLAS 


6236 


1 


703 


WDQNKGAAAGSGLTLPSLPSARFSAGPPTQRSRPTMS^EKHLF 
NLKFAAKELSRSAKKCDKEEKAEKAKIKKAIQKGNMEVARrHAE 
NAI R Q KNQ AVWFLRMS ARVDAVAARVQTAVTMGKVU'KS riAG VV K 
SMDATLKTMNLEKISAIiMDKFEHQFETLDVQTQQMEDTMSSTTT 
LTTPQNQVDMLLQEMADEAGLDLNMELPQGQTGSVGTSVASAEQ 
DELSQRLARLRDQV 


6237 


312 


720 


PTAMAEEGIAAGGVMDVNTALQBVLKTALIHDGLARGIREAAKA 
LDKJIQAHLCVLASNCDBPMYVKLVEAI*CAEHQINLIKVDDNKKL 
GEWVGLCKI DREGKPRKWGCS CWVKDYGKESQAKDVI EE YFK 
CKK 


6238 


2 


4666 


EE VPTQES VKWE INVI IKNPEI VFVADMTKNDAPALVITTQCEI 
CYKGNLENS TMTAAIKDLQVRACPFLP VKRKGKI TTVLQPCDLF 
YQTTQKGTD PQVIDMS VKSLTLKVSPVI INTMITITSALYTTKE 
TIP SET AS S T AHLWE KKDTKTLKMWr LEES N E TE KI A PTTE L VP 
KGE M I KMN IDS I FI VLEAG I GHRT VPMLIAKSRFS GEGKNN SS L 
INLHCQLELEVHY YNEMFGVWEPLLE PLEI DQTEDFRPWNLGIK 
MKKKAKMAIVESDPEEENYKVPEYKTVISFHSKDQLNITLSKC3 
LV71JUHNI»VKAFT2AATGSSADFVKDIJVPFKI1jNSLGLTISVSPS 
DS FSVLN I PMAKS YVLKNGE SLSMDY I RTKDNDH FNAMTS LSS K 
LFFIIiTPWHSTADKIPLTKVGPJRLYTVRHRESGVERSIVCQI 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DTVEGSKKVTIRSPVQIRNHFSVPLSVYEGDTLLGTASPBNEFN 
I PLGSYRS FI FLKPEDEN YQMCEGI DFEEI I KNDG ALL KKKCRS 
KNPSKES FLI N I VPEKDNLTS LS VYSEDGWDLP YIMHLWP PILL 
RNLLPYKI AYYIEGI KNSVFTLSEGHSAQ I CTAQLGKARLHLKL 
LDYLNHDWKS EYH IKPNQQDI SFVS FTCVTEMEKTDLDIAVHMT 
YNTGQTWAFHSPYWMVNICrGRMLQYKADGIHRKHPPNYKKPVL 
FS FQPNHFFNNNKVQLMVTDSELSNQFS 1 DTVGSHGAVKCKGLK 
MDYQVGVTI DLSSFNI TRIVTFTPFYMI KNKSKYH IS VAEEGND 
KWLSLDLEQCIPFWPBYASSKLLIQVERSEDPPKRIYFNKQENC 
ILLRLDNELGGIIAEVNLAEHSTVITFLDYHDGAATFLLI2JHTK 
NELVQYNQSSLSEIEDSLPPGKAVFYTWADPVGSRRLKWRCRKS 
HGE VTQKDDMMMPIDLGEKTIYLVS FFEGLQRI ILFTEDPRVFK 
VTYES E KAE LAEQE I AVALQDVG I SLVNNYTKQE VAY I G ITS SD 
WWETKPKKKARMKPMSVKHTBKLEREFKBYTESSPSEDKVIQL 
DTNVPVRLTPTGHNMKILQPHVIALRRNYLPALKVEYNTSAHQS 
SFRIQIYRIQIQNQIHGAVFPFVFYPVKPPKSVTMDSAPKPFTD 
VS I VMRSAGHS QI SR I KYFKVLI QEMDLRLDLGFI YALTDLMTE 
AEVTENTEVELFHKDIEAFKEEYKTASLVDQSQVSLYEYFHISP 
I KLHLSVSLSSGREEAKDS KQNGGLIPVHSLNLLLKS IGATLTD 
VQDVVFKLAFFELNYQFHTTSDLQS EVIRE YS KQAJ KQMYVL IL 
GLDVLGNP FGLI REFSEGVEAFPYEPYQGAIQGPEEFVEGMALG 
LKALVGGAVGGLACAAS KITGAMAKGVAAMTMDEDYQQKRREAM 
NKQPAGFRBG ITRGGKGLVSG F VSGI TG IVT KP I KGAQKGG AAG 
FF KGVGKGL VGAVAR PTGG I IDMASSTFQG 1 KRATETSEVBSLR 
PPRFFNEDGVIRPYRLRDGTGNQMLQKIQFYREWIMTHSSSSDD 
DDDDDDDDBSDLNH 


6239 


2108 


634


KPGMAGKGSSGRRPLLLGLLVAVATVHLVICPYTKVEESFNLQA 
TKDLLYHWQDLEQ YDHLE FPGVVPRTFLGPWIAVFSSPAVYVL j 
SLLEMS KFYSQL I VRGVLGLGVI FGLWTLQKEVRRHFGAMVATM ; 
FCWVTAMQ FI ILM FYCTRTLPtn/LALP\AO^LALAAWriREEWARFI 
WLSA FAI IVFRV^LCLFLGLLiLLALGNRKVSWRALRHAVPAG 
ILCLGLTVAVDS YFWRQLTWPEGKVLWYNTVLCJKSSNWGTS PLL 
WYFYSALPRGLGCSLLFIPLGLVDRRTHAPTVLALGFKALYSLL 
PHKELR FI I YAFPMLNITAARGCS YLLNNYKKSWLYKAGSLLVI 
GKLVVNAAYSATALYVSHFNYPGGVAMQKLHQLVP PQTDVLLHI 
DVAAAQTGVSRFLQVNSAWRYDKREDVQPGTGMLAYTH I LMEAA 
PGLLAL YRDTHR VLAS WGTTGVS LNLTQLP P FNVHLiQT KLVLL 
ERLPRPS 


6240


2202 


1176 


HERGDS LKEPTS IAESSRHPS YRSEPSLEPESFRSPTFGKSFHF 
DPLSSGSRSSS LKS AQGTG F3LGQLQS I RSEGTTSTS YKS LANQ 
TRNGSLS YDS LLTPSDSPDF3S VQAGPE PDPPLGYTSPFLSARL 
AQQREAERHPRLVPTGPTHR3PSPVRYDNLSRHIVASLQEREXL 
LRQS P PLPGREEE PGLGDS G IQST PGSGHAPRTS SS SDDSKRS P 
LGKTPLGRPAVPRFGKPDGLRGRGVGS PEPG PTAPYLGRSMS Y3 
SQKAQ PGVS ETEE VALQPLLTPKDEVQLKTT YSKSNGQPKSLG S 
ASPG PGQP P LSS PT RGG VKKVSGVGGTTYEI S V 


6241 


3 


1341 


RNAEEKKRLSLQREKI IARVS I DNRTRALVQALRRTTDPKLCIT 
RVEELTFHLLEFPEGKGVAVKERIIPYLLRLRQIKDETLQAAVR 
EILAL IG YVDPVKGRGIRILS IDGGGTRGWAIiQTLRKLVELTQ 
KPVHQLFDYICGVSTGAI LAFMLGLFHMPLDECEELYRKLGSDV 
FSQNVI VGTVKMS WSHAF YDS QTWEN I LKDRMGSALM IETARNP 
TCPKVAAVSTIVNRGITPKAFVFRNYGHFPGINSHYLGGCQYKM 
^AIRASSAAPGYFAEYAWJNDLHQPGGLLLIOTSAIAMHECKC 
LWPDVPLSCIVSLGTGRYESDVRNTVTYTSIiKTKLSNVINSATD 
TEEVHIMLDGLLPPDTYFRF1TPVMCENIPLDESRNEKLDQLQLE 
GLJCZIERNEQKMKKVAKILSQEKTTLQKINDWIKLKTDMYEGLP 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PFSKL 


6242 


198 


1310 


QHFLPGAETWSPGAAVCTAi^FPGRSLAAFPRPAAPRRAVEMGE 
SS ED I DQMFS TLLGBM DIiLTQSLGVDTLP PPDPNPPRAEFNYS V 
GFKDLNESLNAIiEDQDLDALMADLVADISEAEQRTIQAQKBSLQ 
NQHKS ASLQAS I FSGAAS LGYGTNVAATG 1SQYEDDLPPP PAD P 
VLDLPLPPPPPEPLSQEEKEAQAKADIQKLALEKLKBAKVKKIjV 
VKVHMNDNS TXS LMVDERQLARD VLDNL FBKTHCDCNVD WCLYE 
IYPELQIERFFEDHENWBVliSDWTRDTBNKILFLEKEEKYAVF 
KNP QN FYLDNRGKKES KETNE KMNAKNKESLLEVRL I LQSGRKE 
KDVCS IFKS FASENNGKI 


6243 


1509 


614


RSASRFSGCWSRDSTCCCCPSTCWSRSSASCPRARWPPSSAPM 
TS RAS SRRLACGPQTRAGAETRSTAMI RAKS AARDTRRATCRSA. 
AGT PS PTTMTCLTDVPTG CAAVB PTARLPAAAWAS TI TTGCCPA 
MGQAGAGPAGRKGSEAGGGPGRAHHAHPSPLPREPRVRTGPPAH 
SPT PGS IDP S PELSWGSAGVTQE S PLLDFVDFLLFRTRAVDPLR 
RVFFFFYQHLTFFSIQPQPPPCHAFHPRDPPAGTKRQLILVPLK 
GPPIIiAPILSLTPILSRWSCYFPRSRIAQGWHLS 


6244 




1745 


FEHAYASQFGTFLGNNESERCKLKLQQKTMSLWSWVNQPSElLSK 
FTNPLFEANNLVrWPSVAPQSLPLWEGIFLRWNRSSKYLDEAYE 
EMVNIIEYNKELQAKVNILRRQLAELETEDGMQESP 


6245 


81 


1L4B 


LSLRNAKYSFPQELISLFSMTDIiNDN I CKRYIKMI TNI VILSLI 
I CI S LAFW IIS MTASTY YGNLRP ISP WRWL FSWVP VLI VSNGL 
KKKSLDHSGALGGLWGFILTIANFS FFTSLLMFFLSSS KLTKW 
KGE VKKRLDS E Y KEGGQRNW VQ VFCNGAVPTELALLYK I ENGPG * 
BIPVDFS KQYSAS WMCLS JjLAALACS AGDTWAS B VGP VLSKSS P t 
RL I TTWE KVP VGTNGGVT WGLVS S LLGGTFVGI AYFLTQLI FV 
tODIJ>ISAPQWPIIAFGGIAGLjLGSIVDSYIiGATMQYTGLDBSTG 
MWNS PTNKARH I AGKP ILDNNAVNL FSS VL IALLLPTAAWG FW 
PRG 


6246


1177 


359 


SLWPVJILMDDSI^IQISLQLLCVYTANFPNGCSSLCWSSCGQHPV " 

QATHRGAVSNSLMLC ILKLASQM PL ENTTVQQMVFMLLSNLALS 

HDCKGVlQKSNFLQNFLSLAIiPKGGNKHLSNLTIIiWLKLLLNIS 

SGEDGQQMILRLDGCLDLLTEMSKYKHKSSPLLPLLIFHNVCFS 

PANKPKILANEKVITVLAACIiESENQNAQRIGAAALWALiyjTYQ 

KAKTALKS PSVKRRVDEAYSIiAKKTFPNSEANPLNAYYLKCIiEN 

LVQLLNSS 


6247 


3 


1678 


NSRVWGPWTEPSAGSLRPMARKQNRNSKELGLVPLTDDTSHAGP ' 
PGPGRALTiECDHLRSG VPGGRRRKD WS CSLLVAS LAGAFGS S FL 
YGYNLSVVNAPTPYIKAFYNESWERRHGRPIDPDTLTLLWSVTV* 
S I FAIGG t» VGTLI VKM IG KVLGRKHTLLANNGFAI SAALLMACS 
W3AGAFEMLIVGRFIMGIDGGVALSVLPMYL5EISPKEIRGSLG 
OVrAIFlCIGVPTGQLLGLPBLLGKBSTWPYLFGVIWPAWQL 
LSLPFLPDSPRYLLLEKHNBARAVKAFQTFLGKAHVSQBVBEVL 
ABSRVQRS IRLVSVLEIiLRAP YVRWQ VVT VI VTMACYQLCGLNA 
I W FYTNS I FGKAGI P PAK3 P YVTLSTGG I ETLAAV FS GLVI EHL 
GRRPLL IGGFGLMGLFFGTLT I TLTLQDHAPWVP YLS I VG I LAI 
I ASFCSGPGGI PFI LTGEFFQQSQRPAAFI I AGTVNWLSKPAVG 
LLFPFIQKSLDTYCFLVFATICITGAIYLYFVLPETKNRTYAEI 
SQAFS KRWKAYPPEEKI DSAVTDGKINGRP 


6248 


5^ 


1773 


VP PPRMMAAVP PGLE ?WNR VR I PKAGNRS AVTVQNPGAALDLCI 
AAVI KECHLVILS LKSQTIjDAETDVLCAVL YSNHNFJ4GRHKPHL 
ALKQVEQCLKRLKNMNLEGS IQDLFELFS SNENQ PLTTKVCWP 
SQPWELVLMKVLGACKLLLRLLDCCCKTFLLTVKHLGLQEFI I 
LNLVMVGLVSRLWVLYKGVLKRLILLYEPLFGLLQBVARIQPMP 

YFKDFTFPSDITBFLGQPYFEAFKICKMPIAFAAXGINKLLNKLF 
LINEQSPRASEETLIXJISKKAKOMKINVQEIfcn^ 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








KBESSE FD VRAFCNQLKHKATQETSFDFKCS QSRLKTTKYS S Q K ~ 
VIGTPHAXSFVQRFREA2SFTQLSEEIQMAWWCRSKKLKAQAI 
FliGNKLLKSNRLKHLEAQGTSLPKKLECIKTSIOJHLLRGSGIK 
T S KHHLRQRRSQNKFLRRQRKPQR KLQSTLLRE IQQ FSQGTRKS 
ATDTSAKWRLSHCTVHRTDLY PNS KQLLNSG VSMPV 1 QTKEKM I 
HENLRGIHENETDSWTVMQINKNSTSGTIKBTDD1DDIFALMGV 


6249 


56 


1773 


VPPPR^4MAAVPPGLEPWNRVRIPKAGNRSAVTVQNPGAALDLCI " 

AAVIKECHLVILSLKSQTLDAETDVLCAVI/ySNHNRMGRHKPHL 

ALKQVEQCLKRLKNMNLEGSIQDLFELPSSNENQPLTTKVCVVP 

SQPWELVLMKVLGACKLLLRLLDCCCKTFLLTVKHLGLQEFII 

LMLVMVGLVSRLWVLYXGVLKRLILLYBPLFGLLQEVARIQPMP 

YFKDFTFPSDITEFLGQPYFBAFKKKMPIAFAAKGINKLLNKliF 

LINEQSPRAS3ETLLGISKKAKQMKINVQNNVDLGQPVKNKRVF 

KEESSEFDVRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK 

VIGTPHAKSFVQRFREAESFTQLSEEIQMAWWCRSKKUKAQAI 

FLGNKLLKSNRLKHLEAQGTSLPKKLEC3KTSICNHLLRGSGIK 

TS KHHLRQRRSQNKFLRRQRKPQRKLQSTLLRE IQQFSQGTRKS 

ATDTSAKWRLSHCTVHRTDLYPNS KQLLNSGVSMPVIQTKEKMI 

HENLRGI HENETDS WTVMQINKNS TSGTI KETDDIDD I PALMGV 


6250 


232 


1306 


IJUU^IMALPFRKDLEKYKDLDEDELLGNLSETELKQLETVIiDD 
LD PENALL P AGFRQ KNQTS KSTTG PFDREHLLS YLEKEALEHKD 
REDYVPYTGEKKSKIFIPKQKPVQTFTEEKVSLDPELEEALTSA 
SDTELCDLAAILGMHNL ITNTKF CN I MGSSNG VDQEHFSNWKG 
EKILPVFDEPPNPTNVE BSLKRTKENDAHLVE VNLNNIKDTI PI P 
T L KD FAK AL ETNTHVKC FS LAATR S ND P VATAFAE ML KVNKTLK 
SLNVESMFI TGVG ILAL I DALRDNETLAELKI DNQRQQLGTA VE 

LEMAKMLEENTNILKFGYQFTQQGPRTRAANAITKNNDLVRKRR 
VBGDHQ 


6251 


62 


972 


T PGSGPMSAWAAAS IjSRAAARCLLARG PG VRAAPPRD PRP SH PE 
PRGCGAAPGRTLHFTAAVPAGHNKWS KVRHI KGPKDVERS R I FS 
KLCLN IRLAVKEGGPNP EHNSNLANI LE VCRSKHMPKSTI ETAL 
KMEKS KDTYIiI,YEGRGPGGSSIiLIEALSNSSHKCQADIRHI LNK 
NGG VMAVGARHS FDK KG VI WEVEDREKKAVNLERA LEMA I EAG 
AEDVKETBDEEERNVPKFICDASSLHQVRKKLDSLGLCSVSCAL 
EFIPNSKVQIAEPDLEQAAHLIQALSNHEDVIHVYDNIE 


6252


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVP PKKDKLQTKRXKPRRYWEE ' 
ETVPTTAGASPGPPRNKKNRELRPQRPKNAYI LKKSR1SKKPQV 
PKKPREWKNPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 
KLPHS XAKTRSRLE VAEAEEEETS IXAARSELLLAEEPGFLEGB 
DGEDTAKI CQADI VEAVDI ASAAKHFDLNLRQFG P YRLNYSRTG 
RHLAFGGRRGH VAALD WVTKKLMCB INVMEAVRD IRFLHSEALL 
AVAQNRWLHI YDNQGIELHC IRRCDRVTRLEFLP FHFLLATAS E 
TGPLT YLDVS VGKI VAALNARAGRLD VMS QNP YNAVI HLGHS NG 
TVSLWSPAMKEPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL 
KI FDLRGT YQPLS TRTLPHGAGHLAFS QRGLLVAGMGDWNI WA 
GQGKASPPSLEQPYLTHRLSGPVHGLQFCPFEDVLGVGHTGGIT 
SMLVPGAGEPNFDGLESNP YRSRKQRQEWEVKALLBKVPABL I C 
LDPRALAEVDVI sleqgkkeqi ERLGYDPQAKAPFQPKPKQKGR 
SSTASLVKRKRKVMDEEHRDKVRQSLQQQHHKEAKAKPTGARPS 
ALDRFVR 


6253 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKIjQTKRKKPRRYWEE 
ETVPTTAGASPGPPRNKKNRELRPQRPKTTAYILKKSRISKKPQV 
PKKPREWKKTPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 
KLPHSKAKTRSRLE VAEAEEEETS I KAARSSLLLAEEPGFLEGE 
DGEDTAKICQADrVEAVDIASAAKHFDItNLRQFGPYRIiNYSRTG 
RHLAFGGRRGHVAALDWVTKKLMCEINVMEAVRD I RFLHSEALL 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








AVAQNRWLH I YDNQGI B LHCI RRCDRVTRLE FI*P FH FLLATASB 
TG FLT YLDVS VGKI VAALNARAGRLDVMSQNP YNAVIHLGHS NG 
T VS*LWS PAMKB PLAKI LC3 IRGGVRAVAVDSTGT YMATSGLDHQL 
KIFDLRGTYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDWNIWA 
GQGKASPPSLBQPYLTHRLSGPVHGLQFCPFEDVLGVGHTGGIT 
SMLVPGAGEPNFDGLESNPYRSRXQRQBWEVKAIjIiEKVPAELIC 
LDPRALAE VD VI S L EQGKKEQ I ERUSYDPQAKA PFQPKP KQKGR 

SSTASLVKRKRKVMDEEHRDKVRQSLQQQHHKEAKAKPTGARPS 
ALDRFVR 


6254 


155 


1139 


HALGRRGGSQELSAAACGCFALRLRAPGSGRPALAPGAAAFAGIi 
GGAPRFPPRGSAAGRTMLLKEYRICMPLTVDBYKIGQLYMISKH 
SHEQSDRGEGVEWQNEPFEDPHflGNGQFTEKRVYLNSKLPSWA 
RAWPK1 FYVTEKAWN YYPYTITEYTCS FLPKFSIHIETKYEDN 
KGSNDTIFDNEAKDVEREVCFIDIACDEIPERYYKESBDPKHFK 
SEKTGRGQLREGWRDSHQPIMCSYKLVTVKFEVWGLQTRVEQFV 
HKWRDI LL IGHRQAFAWVDEW YDf-TTMDDVRE YEKNMHEQTNI K 
VCNQHSS PVDD I ESHAQTST 


6255 


1


1444 


PTKPQQELLVSLATVIFVASQKALSVESKAVIKQQLESVSNGWT 
VYRIARQASRMGNHDMAKELYQSLLTQVASKHFYFWLNSLKEFS 
HAEQCLTGLQBENYSSALSCIAE3LKFYHKGIASLTAASTPLNP 
I^FQCEFVKLRIDLLQAFSQLICTCNSLKTSPPPAIATTIAMTL 
GNDLQRCGRI SNQMKQ SMEEFRSLASRYGDLYQAS PDADS ATLR 
NVELQQQSCIiLI SHAI E ALI LDPESAS FQE YGS TGTAHADS E YE 
RRMMS VYNHVLEEVES LNGKYTPVS YMHTACLCNAI IALLKVPL 
S FQR Y ? FQKLQSTS I KLALS PS PRNPAEP IAVQNNQQLALKVEG 
VVQHGSKPGLFRK1QSVCLNVSSTLQSKSGQDYKIP1DNMTNEM 
EQRVEPHNDYFSTQFLIiNFAI LGTHN ITVESSVKDANG I VWKTG 
PRTT I FVKSLEDPYSQQ IRLQCQQAQQPLQQQQQRNAYTRF 


6256 


1 


1542 


xwnvnucnntir (WfnviiV tr OJUCO lu A<-> V trtrt\tr\3 L F1A.1 U.SWAf >fl 

vdeqeaaaeslsnlhlkeekikpdtngawktnanaektdeeek 
edraaqs llnkli rsnlvdntnqve vlqrdpnsplys vks feel 
rlkpqij^qGvyamgfnrpskiqenalplmlaeppqnliaqsqsg 

TGKTAAFVLAMLSQVEPANKYPQCLCLS PTYBLALQTGXVT EQM 

gkfypelkijvyav^gnklergqkiseqivigtpgtvldwcsklk 
fidpkkikvfvldeadvmiatqghqdqsiriqrmlprncqmllf 
satfedsvwkfaqkwpdpnvtklkreeetldtikqyyviicssr 

DEKFOALCNLYGAITIAQAMI FCHTRKTASWIiAAELSKEGHQVA 
LLSGEMMVEQRAAVIERFREGKEKVLVTTNVCARGIDVEQVSW 
INFDLPVDKDGNPDNETYLHRIGRTGRFGKRGLAVNMVDSKHSM 
NILNR I QEHFN K K IERLDTDDLDE I E KI AN 


6257 


210 


615 


AF I PAMAEt*! QKKLQGE VE K YQQLQKDLSKS MS GRQ KLEAQ LTE 
NNIVKEEI^LEGSmA/FKLLGPVLVKQELGEARATVGKKLDYI 
TAEI KRYESQLRDLERQSEQQRETLAQLQQEFQRAQAAKAGAPG 
KA 


6258 


210 


615 


AFI PAMAEL IQKKLQGEVEKYQQLQKDLSKSM^GRQKLEAqLTB 

NNIVKEEIJU^LDGSNVVFKLLGPVLVKQELGEARATVGKRJLDYI 

TAEIKRYESQLRDLERQSBQQRETLAQLQQEFQRAQAAKAGAPG 
KA 


6259 


2 


1540


I LEKGFPS Q CHP ER KWKVDD VLESSQENEDDHFWELLFHNNKT V 
S V3NGDRGS KTFNLGTDPVSLRN YP YK2 CDS CEMNI»KNTSGL 1 1 
SKKNCSRKKPDEFNVCEKLLLDIRHEKIPIGEKSYKYDQKRNAI 
NYHQDLSQPSFGQSFEYSKNGQGFHDEAAFFTNKRSQIGETVCK 
YNECGRTF IESLKLNISQRPHLBMEPYGCSI CGKS PCMNLRFGH 
QRALTKDNP YBYNEYGE I FCDNS AFI I HQGAYTRKI LRE YKVSD 
KTWEKSALLKHQIVHMGGKSYDYNENGSNFSKKSHLTQLRRAHT 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








KPHLTNHQRTHTGE KP YE CKQCGKTFCVKSNLTEHQRTHTGBKP 
YECWACGKSFCHRSALTVHQRTHTGEKPPICNEOGKSFCVKSNL 
IVHQRTHTGEKPYKCNECGKTFCEK3ALTKHQRTHTGEKPYECN 
ACGKTFSQRS VLTKHQR IHTRVKALSTS 


6260 


2081 


1436 


GTGPE IHACAHAS ARAPG SRAMALRELKVCLLGDTGVGKS S I VW 
RFVEDSFDPNINPTIGASFMTKTVQYQNELHKFLIWDTAGQERF 
RALAPMY YRGS AAA 1 1 VYnTTJTP PTT?Q TT .vnuvvpt DouriDDM t 

WAI AGNKCDL I D VRB VMERDAKDYADS IHAI FVBTSAKNAI NI 
NELFIEISRRIPSTDANLPSGGXGFKLRRQPSEPKRSCC 


6261 


3 


1188 


FWYRI,GPGTRSRWPRRGSWAASLVPRGPSPAAiVTSPCPPDPIiR 
SPACEPCRPDFAPRPALIiIjRSGPRSAPAVTGKPALKGQPGPWPG 
MAEVS IDQS KLPGVKB VCRDFAVL EDHTLAHSLQEQE I EHH LAS 
NVQRNRLVQHDLQVAKQLQEBDLKAQAQLQKRYKDLEQQDCEIA 
QEIQEKIAIEAERRRIQEKKDEDIARLLQBKELQEEKKRKKHFP 
EFPATRAYADSYYYEDGGMKPRVMKBAVSTPSRMAHRDQBWYDA 
EIARKLQEEBLLATQVDMRAAQVAQDEEIARLLMAEEKKAYKKA 
KEREKSSLDKRKQDPEWKPKTAKAANSKSKESDEPHHSKNERPA 
RPPPPIMTDGEDADYTHFTNOQSSTRHFSKSESSHKGFHYXH 


6262 


2 


1759 


PECHSQGLCSVHRPGKVPQARMSGLVLGQRDEPAGHRLSQEKIL 
GSTRLVSQGLEALRSBHQAVIiQSIiSQTIBCIjQQGGHEEGLVHEK 
aku LiKK.bM KN I ELGLSEAQ VMLALASHLSTVES E KQKLRAQVRR 

u:qenqwlrdeiagtqqrlqrseqavaqleeekkhleflgqlrq 
ydedghtsbekegdatkdslddlfpneeeedpskglsrgqgata 
aqqggyeiparlrtlhnlviqyaaqgryevavplckqaledler 
tsgrghpdvatmlnilalvyrdqnkyiceaahllndals ires tl 

GPDHPAVAATLNNLAVLYGKRGKYKEAEPLCQRAL£IREKVLGT 
NHPDVAKOLNNLAIjLCQNQG ICYEAVERYYQRALA 1 YEGQLGPDN 

pnvartknklascylkqgkyaeabtlykeiltrahvqefgsvdd 
dhkpiwmhaeereemsksrhheggtpyaeyggwykackvssptv 
nttlrnlgal yr3qgkleaaetr»eecalrsrrqgtdp i sqtkva 
EUJZES DGRRTS qegpgds VKFEGGE d as vavews gdgsgtlqr 

SGSLGKIRDVLRR 


6263 


1 


2408 


RELDSLADLPERIKPPYANGLSTSHIiRSSSVEDVKLlISEGRPT 
IEVRRCSMPSVICEHTKQFQTISBESNQGSLLTVPGDTSPSPKP 

bvfsnvperdlsnvsnihssfatsptgasnskyysadrnl I KMT 

APVNTVMDS PVHLEPS SQ VG VI QNKS WEMPVDRLETLSTRD F IC 
PNSNIPDQESSLQSFCNSENKVLKENADFLSLRQTELPGNSCAQ 
DPASFMPPQQPCSFPSQSLSDAESISKHMSLSYVANQEPGILQQ 
KNAVQ I ISSALDTDNSSTKDTENTFVLiGDVQKTDAFVP VYSDS T 
IQEAS PNFEKAYTLPVLPSEKDFNGSDASTQLNTHYAPS KLTYK 
SSSGHEVENSTTD^QVISHBKENKLESLVLTHLSRCDSDLCE^5N 
AGM P KGNLNEQDPKHC PES E KCU>S IEDEESQQ5 I LS SLENHSQ 
QS TQ PEMHKYGQLVKVEliEENAEDDKTENQI PQRMTRNKANTI4A 
NQSKQILASCTLLSEKDSESSSPRGRIRLTEDDDPQIHHPRKRK 
VSRVPQPVQVSPSLLQAKEKTQQSLAAIVDSLKLDEIQPYSSBR 
ANP YFE YLHIRKKIEEKRKLLCSV I PQAPQYYDEYVTFNGS YLL 
DGNPLSKICIPTITPPPSLSDPCiKELFRQQEVVRMKLRLQHSIE 
RERliI VSNEQEVLRVHYRAARTLA1JQTLP FS ACTVLLDAEVYNV 
PLDSQSDDSKTSVRDRFNARQFMSWLQDVDDKFDKLKTCIjLMRQ 

QHEAAALNAVQRLEWQLKLQELDPATYKSISIYEIQEFYVPLVD 
VNDDFELTPI i 


6264


143 


1960 


KHRQENNALDMAPEIHMTG PMCL IENTNGEL VANPEALKI LS A i" 
TQPVVWAlVGIi YRTGKS YLMN KLAGKNKGFSLGST VKSHTKG I 
WMWCVPHP KKPEHTLVLLDTBGLGDVKKGDNQNDS WIFTLAVLL 
SSTLVYNSMGTIKQQAMDQLYYVTELTHRIRSKSS PDENENEDS 
ADFVS F FPOFVWTLRDFS LDLSADGQPLTPDEYLE YSLKLTQGT 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








SQKDKNFNLPRbCIRKFFPKKKCPVPDLPIHRRKIAQLEKliQDE" 
ELDPEFVQQVADFCS YI FSNSKTKTLSGG I KVNGPRLBSLVLTY 
I N AXS RGDL PCM3RAVLALAQ I ENS AAVQ KAI AHYDQQMGQKVQ 
LPAETLQELIjDLHRVSEREATEVYMKNS fkdvdhlfqkklaaql 
DKKRDD FCKQNQ EAS SD RCSALLQV I FS PLEE B VKAG I YSKPGG 
YCLFIQKLQDIiEKKYYEBPRKGIQAEEILQTYLKSKESVTDAIL 
QTDQILTE KEKE IEVECVKAESAQAS AKMVEEMQI KYQQMMEE K 
EKSYQEHVKQLTEXM£RERAQLLEEOEKTLTSKLQEQARVLKER 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6265 


143 


1960 


KHRQEKNALDMAPE IHMTGPMCLIENTNGELVANPEALKILSAt 
TQPVVWAIVGLYRTGKSYI^KLAGKNKGFSLGSTVKSHTKGI 
W M Wt_ V FH F KJv. P BHTJj VTjIjDTEGLGD VKKGDNQNDSWZFTLAVLL 

sstlvynsmgtinqqamdqlyyvtelthr i rsksspdeneneds 
adfvsffpdfvwtlrdfsldleadgqpltpdeyleyslkltqgt 
sqkdknfnlp r lci r kf fpkkkcfvfdl p i krrklaqleklqde 
eldpe fv qqvadfcs y i fsnsktkt ls gg i kvngprles lvlty 
inai s rgdl p cmenavlalaq i ens aavq kaiahydqqmgqkvq 
lpaetlqelldlhrvsereatevymknsfkdvdhlpqkklaaql 
dkkrddfckqnqeas sdrcs allqvi fs ple eevkagi ys kpgg 

YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL 
QTDQILTEKEKEIEVECVKABSAQASAKMV^EMQIKY0X2f - IMEEK 
EKSYQEHVKQLTEKMERERAQLLEEQEKTLTSKLQEQARVLKER 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6266 


276 


1421 


GSHQkQMLVPCFLYSLQNRKPSLYGSLTCQG IGLDG I PEVTASE 
GFT WE I NKKS IHI S CPKENAS S KFLAP YTT FSRIHTKS I TCU) 

TGODfiPT r , \fOOD1'nr , irMT7'TUn7lOM/ , inT nntrr nriTnTTiTLiii.iji.iiL - l -. _ 

laa k*lAjJjL» v © £>fc» 1 LKa I MKJl W yASEf GELRR V LEGHVFD VnCCR FF 
PSGLWLSGGMDAQLKI W3 AEDAS CWTFKGHKGG I LDTAI VDR 
GRNWSA5RD GTARLWDCGRSACLG VLADCGSS INGVAVGAADN 
SINLGSPEQMPS ERE VGTEAKMLL LAREDKKLQCLGLQSRQL VF 
LFIGSDAFNCCTFLSGFIiLIjAGTQDGNIYQLDVRSPRAPVQVIH 
RS G A P VT iS LLS VRDG FI AS QGDG S CF JCVQQ DLD YVTELTGAD CD 
PVYKVATWEKQIYTCCRDGLVRRYQLSDL 


6267 


3 


622 


lgmmkknnsakrgpqdgnqqpappekvgwvrkfcgkgifreiwk " 

nrywlkgdqlyisekevkdekniqevfdlsdyekceelrksks 

rskknhskftlahskqpgntapnliflavspeekeswinalnsa 

ITRAKNRILDEVTVEEDSYTjAHPTPnRJiTfTOMQPPT3'D , i»Dr , i3T mb 

vaststsdgmltldliqeedps peeptslc 


6268 


160 


1368 


hrelcqnlpaglssalidnpltlllsidtyvmlqepvtfqdvav 
dfsreewgllgptqrteyrdvmletfghlvsvgwettlenkela 
pnsdipeeepapslkvqessrdcalsstledtlqggvqevqdtv 
lkqmesaqekdlpqkkhfdnresqansgaldtnqvslqkidnpe 
sqansgaldtnqvllhkipprkrlrkrdsqvksmkhnsrvkihq 
kscerqkakegngcrktfsrstkqitfirihkgsqvcrcsecgk 
ifrnpryfsvhkkihtgerpyvcqdcgkgfvqsssltqhqrvhs 
gerpfecqecgrtfndrsaisqhlrthtgakpykcqdcgkafrq 
sshlirhqrthtgerpyacnkcgkaftqsshlighqrthnrtkr 

KKKQPTS 


6269 


2886 


1449 


hasaptrrnmaaas plrdchawkdarlplsttsneacklfdatl" 
tqyvkwiwdkslggi egcls klkaadptfvmghamatglvl igt 
gssvkldkbldlavktmvei srtqpltrreqlhvsavetfangn 
fpkacelweqilqdhptdmlalkfshdaypylgyqeqmrdsvar 
i yp fwtpdi plss yvkgi ys fglme tnfydqae klaxeals inp 
tdawsvhtvah ihemkae i kdglbfmqhsetlwkdsdmlachny 
whwalyliekgeyeaaltiydthilpslqandamldwdscsml 
yrlqmegvsvgqrwqdvlpvarkhsrdhillfndahflmaslga 
hdpqttqbllttlretasespgejnc^ 
SEQ
ID
NO:


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DGNPDRVLELiLLP I RYRI VQLGGSNAQRDVFNQLI* I HAALNCTS 
SVHKNVARSLLMERDALKPNS PLTERLIRKAATVHLMQ 


6270 


23 


2066 


SVl-VTIiGSkGDGRPPTYHLEEMEQEPQNGEPAElKllREAYKKAH 
FLFVNKGLNTDELGQ KEEAKNYYKQG IGHLLRG 1 S I SSKE S EHT 
GPGWESARQMQQKMKETLQNVRTRLE ILEKGLATS LQNDLQEVP 
KLYPEFPPKDMCEKLPEPQSFSSAPQHAEVNGNTSTPSAGAVAA 
PASLS LPSQS CPAEAP PAYTPQAAEGHYTVSYGTDSGEFSS VGE 
EFYRNHSQPPPLETLGLDADEL1LIPNGVQIFFVNPAGEVSAPS 
YPG YLR I VRFLDNSLDTVLNR PPGPLQVCDWL YPL VPDRS P VLK 
CTAGAYMFPDTMLQAAGCFVGWLSSELPEDD^ELFEDLLRQMS 
DLRLQANWNRAEEENE FQI PGRTRPSSDQLKEASGTDVKQLDQG 
NKDVRHKGKRGKRAKDTSSEEVNLSHIVPCEPVPEEKPKELPEW 
S E KVAHNI LSGAS WVS WG LVKGAE ITGKA IQKGAS KLRER IQPE 
EKP VEVS PAVTKGL Y I AKQATGGAAKVS QFLVDGVCTVANCVG K 
ELA.PHVKKHGSKLVPESLKKDKDGKSPLDGAMWAASSVQGFST 
VWQGLECAAKCI VNNVS AETVQTVRYK YGYNAGEATHHAVDS AV 
NVGVTAYN INNIG I KAMVKKTATQTGHTLLEDYQI VDNSQRENQ 
EGAANVNVRGEKDEQTKEVKEAKKXDK | 


6271 


32 


1058 


GCGVKTAGMVGREKELS IHFVPGSCRLVEEBVNI PNRRVLVTGA1 
TGLLGRAVHKE FQQNNWHAVGCG FRRARPKFEQVNLLDSNAVHH 
IIIIDFQPHVTVHCAAERRPDWENQPDAASQIjNVDASGNIJUCEA 
AAVGAFLIYISSDYVFDGTNPPYREEDIPAPLNLYGKTKLDGKK 
AVLENNLGAAVLRI P I L YGB VE KLEES AVTVMFDKVQFSNKS AN 
MDHWQQRFPTHVKDVATVCRQLAE KRMLDPS I KGTFHWS GNEQM 
TKYEMACAXADAFNLPSSHLRPI TDS PVLGAQRPRNAQLDCS KL 
ETLG I GQRTP FRI G I KESLW PFLI DKRWRQTVFH 


6272 


1136 


£28 


GAVMEE3AAAPGRTEGVl,ERQGAPPAAGQGGALVELTPTPGGlSlH 
VS P YHTHRAGDPLDLVALAEQVQKADSFI RANATNXLTVIAEQI 
QHLQEQARKVLEDAHRDANLHHVACNI VKKPGNI YYLYKRESGQ 
QYFS I ISP KEWGTS CPHDFLGA YKLQHDLSWTP YEDI E KQEAKI 
SMMDTLLSQSVALPPCTBPNFQGLTH


6273 


256 


843 


SCPRVSPEC^LGCQVMFSLPI^CSPDHIRRGSCWGRPQDljaAn 
SAAWNS KCHPG AGAAMARQHARTLW YDRPRYVFMEFCVEDS TDV 
HVL I EDH R I VF S CKNADG VE L YNE I E F YAKVNSKDS QDKRS S RS 
IT C FVRKWKEKVAWPRLTKEDI KP VWLS VDFDJNWRD WEGDEEME 
LAHVEHYAEVRDNTYCVLPT


6274 


56 


1142 


AAAAMAAAAGGGAGAARSLS R FRGCIAGALLGDCVGSFYEAHD'l' [ 
VDLTSVLRHVQSLEPDPGTPGSERTEALYYTDDTAMARALVQSL 
LAKEAFDE VDMAHRFAQE YKKDPDRG YGAG VVTVFKKLLNP KCR 
DVFE PARAQFNGKGSYGNGGAMRVAG I SLAYS SVQDVQKFARLS 
AQLTHA5S LG YNGAI LQALAVHLALQGES SSKHFLKQLLGHMED 
liBGDAQSVLDARELGMEERPYSSRLKKIGELLDQASVTREEWS 
ELGNG I AAFESVPTAI YCFLRCMEPDP B I PS AFNSLQRTLI YS I 
S LGGDTDT I ATMAGA IAGAYYGMDQVPES WQQS CEG YE ETDI LA 
QSLHRVFQKS 


6275 


20


565 


GGHDKGDGDKSAAEAQGMSREEYEEYQKQLVEEKMERDAQFTQR 

KASRATLRSHFRDKYRLPKNETDESQIQMAGGDVELPRELAKMI 

EEDTEEEEEKASVLGQLASLPGLNLGSLKDKAQATLGDLKQSAE 
KCHVM 


6276 


797 


97 


TLLPLPPLPlJrEGWILLNTGLEGTVAENPVPIVHTPSGNILTLE| 
SCI^QLATHPGHWGIHLQIAEPAALRPSIAMARLSSLGLLHWP 
VW VGAK ISHGS FS VP GHVAGR EIJiTAVAEVFPHVTVAPG WPEEV 
LGSGYREQLLTDMLELCQGLWQPVS FQMQAMLLGHSTAGA IGRL 
LAS S P RAT VTVEHNPAGGD Y AS VRTALLAARAVDRTRVY YR L P Q 
3YHKDLLAHVGRN 


6277 


4 Goo 


2744 


MAFRTEMGLYYSYPKTIVEAPSPiiNGVWMIMNDKLTEYPLVINT 
LKRFNLYPEVILASWYRIYTKIMDLIGIOTKICWTVTIGEGLSP 
TES CEGLGD PACFYVAVI F I LNGLMMALFFI YGTYLS GSRLGGL 
VTVLCFFPNHGECTRVMWTPPLRESFSYPFLVLQMLLVTHILRA 
T KL YRGSL IALCI SNVF FMLP WQFAQFVLLTQ IAS LFAVYWGY 
IDICKLRKI1YIHMISLALCFVLMFGNSMLLTSYYASSLVIIWG 
rLAMKPHFLKINVSELSLWVIQGCFWLFGTVILKYLTSKIFGIA 
NDAHIGNLLTSK?FSYKDFDTLLYTCAAEFDFMEKETPLRYTKT 
LLLPWLVGFVAIVRKI ISDMNGVLAKQQTHVRKHQFDHGELVY 
HALQLLAYTALGILIMRL3CLFLTPHMCVMASLTCSRQLFGMLFC 
KVHPGAI VFAjCLAAMS IQGSANLQTQWUrVGEFSNIiPQEBLIEW 
I KYSTKPDAVFAGAMpTMAS VKLSALRPI VNHPHYEDAGLRART 
KI VYSM YSRKAAEEVKREL I KLKVNYY II»B ESWCVRRS KP GCS M 

PE I WDVEDPANAGKTPLCNLLVKDSKPHFTTVFQNSVYKVLEVV 
KB 


6278 


3 


823 


ILFRLVLLSLVYLLNSVATEERKPAEVblVEGQQYAWGTVLLL 
I RI ILE YCQG VDN1 PSVTTDMLTRLSDLLKYFNSRSCQLVLGAG 
AIOWGLKTITTKMLALSSRCLQLIVHYI PVIRAHFEARLPPKQ 
YSMLRJIFDHITKDYHDHIAEISAKXVAIMDSLFDKU.SKYEVKA 
PVPSACFRNICXQMTKMHBAI FDLLPEEQTQMLFLRINAS YKLH 
LKKQLS HLNV INDGG PQNGLVTAD VAFYTGNLQALKGLKDLDLN 
MAEIWEQXR 


6279


127 


1687 


GGAMASDGARKQFWKRSNSKLPGSIQHVYGAQHPPFDPLLHGTI, 
LRSTAKMPTTPVKAKRVSTFOEFESNTSDAWDAGEDDDELLAMA 
AESIjNSEVVMETANRVLRNHSQRQGRPTLQEGPGLQQKPRPEAE 
PPSPPSGDI*RLVKSVSESHTSCPAESASDAAPLQRSQSLPHSAT 
VTLGGTSDPSTLSSSALSEREASRLDKFKQLLAGPNTDLEELRR 
LS WSG I P KPVRPMT WKLLSGYLPANVDRRPATLQRKQ KEYPAFI 
EHYYDSRNDEVHQDTYRQIHlDIPRMSPEALIIiQPKVTEIFERI 
LFI WAIRHPASGYVQG INDLVTPFFWFICBYIEABE VDTVDVS 
G VP AE VLCNIEADTYW CMSKLLDG I QDN YTPAQPG IQMKVKMLE 
ELVSRIDEQVHRHLDQHEVRYLQFAFRWPaNNLLMREVPLRCTI R 
LWDTYQSEPDGFSHFHLYVCAAFLVRWRKEILEBKDFQELLLFL 
QNLPTAHWDDEDISLLLAEAYRLKFAFADAPNHYKK 


6280 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEBEEEDB 
DVDLAQVLAYLLRRGQVRLVQGGGAANLQFIQALLDSEEENDRA 
WDGRLGDRYNPPVDATPDTRELEFNEIKTQVELATGQIiGLRRAA 
QKHSFPRMLHQRERGLCHRGSFSLGEQSRVISHFLPNDLGFTDS 
YS QKAFCG I YS KOGQ I FMS ACQDQTIRLYDCRYGRFRKFKS I KA 
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHICNIYGEGDTHTALD 
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ 
IESHEDDVNAVAFADIS SQI LFSGGDDAICKVWDRRTMREDDPX 
PVGAXiAGHQDGITFIDSKGDARYLISNSXDQTIKLWDIRRFSSR 
BGMEASRO^TC^NWDYRWQQVPKKARRKLKLPGDSSLMTYRGH 
GVLHTLIRCRFSPIHSTGQQFIYSGCSTGKVWYDtiLSGHIVKK 
LTNHKACVRDVS WHPFBEKI VS SS WDGNIiRL WQYRQAEYFQDDM 
PESBECASAPAPVPQSSTPFSSPQ 


6281


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEBEEEDE ' 
DVDLAQVLAYLLRRGQVRLVQGGGAANLQPIQALLDSEEBNDRA 
WDGR LGDRYNP P VDATPDTRELEFNB IKTQVELATG QLGLRRAA 
QKHS FPRMLHQRERGLCHRGSFSLGEQSRVISHFLPNDLGFTDS 
YSQKAFCG I YS KDGQ I FMSACQDQT I R LYDCRYGRFRKFKS I KA 
RDVGWSVLDVAFTPDGNHFLYSSWSrJYIHICNIYGEGDTHTALD 
LRPDERRFAVFS IAVS SDGR EVLGGANDGCLYVFDREQNRRTLQ 
IESHEDDVNAVAFADISSQ I LFSGGDDAI CKVWDRRTMR EDD PK 
PVG ALAGHQDG ITP1 DSKGDARYL XSNS KDQT I KLWDIRRFS SR 
EGMEASRQAATQQNW DYRWQQVP KKAWRKLKLPGHSS LMTYRGH 
GVLHTLIRCRFSPIHSTGQQFIYSGCSTGKWVyDLLSGHIVKK 
LTNH KACVRDVS WHP PEE KI VS S S NDGNLRL WQ YRQAE YFQDDM 
PESEECASAPAPVPQSSTPFSSPQ 


6282 


12S 


906 


RMAACRALKAVLVDLSGTLHIEDAAVPGAQEALKRLRGASVflR 
FVTNTTKES KQDLLERLRKLE FD I 5EDE I FTS LTAARS LLERKQ 
VR PMLLVDDRALPDFKG I QTS D PNAWMGLAP EHFH YQ I LNQAF 
RLLLDG A P L I AI HKAR YYKRKDGLALG PG PF VTALEYATDTKAT 
WGKPEKTFFLEALRGTGCEPEEAVMIGDDCRDDVGGAQDVGML 
GILVKTGKYRASDEEKINPPPYLTCESFPHAVDHILQHLL 


6283 


140 


1043 


IiSLFGIHVMNPFMSMSTSSVRKRSEGBEKTLTGDVKTSPPRTAP™" 
KKQLPSIPKNALPITKPTSPAPAAQSTNGTHASYGPFYLEYSLL 
AEFTLWKQKLPGVYVQ PS YRSALMWFGVI P IRHGLYQDGVFKF 
TVYIPDNYPDGDCPRLVFDIPVFHPLVDPTSGEIiDVKRAFAKWR 
RNHNH I WQ VLMYARRVF YK I DTAS PLN PEAAVLYEKDIQLFKS K 
WDS VKVCTARL FDQP KI E DPYAI S FS PWNPS VHDEARE KMLTQ 
KKKPEEQHNKSVHVAGLS WVKPGSVQ PFSKEEKTVAT ~ ! 


6284 


1 


2879 


RSVIPGSTISSRWPGLSRPRFMAAHEWDWPQREELIGQISDIRV 

QNLQ VERENVQKRTPTRWI NLHLE KCN PPLEVKDL F VDI QDGKI 

LMALLEVLSGRNLLHEYKS SSHRI FRLNNIAKALKFLEDSNVKL 

VSIDAAEIADGNPSLVLGLIWNIILFFQIKELTGNLSRNSPSSS 

LAPGSGGTDSDSS FPPTPTAERSVA I S VKDQRKAI KALLANVQR 

KTRX YGVAVQDFAGSWRS G LAFLAV I KAI DPS LVDMKQALENST 

RENLEKAPSIAQDALHIPRLLEPEDIMVDTPDEQSIMTYVAQFL 

ERPPELEAEDIFDSDKEVPIESTFVRIKETPSBQESKVFVLTEN 

GERTYTVNHETSHPPPSKVFVCDKPESMKBFRLDGVSSHALSDS 

STEFMHQIIDQVLQGGPGKTSDISEPSPESSILSSRKENGRSNS 

LPIKKTVHFEADTYKDPFCSKNLSLCFEGSPRVAKESLRQDGHV 

LAVEVAEEKEQKQESSKIPESSSDKVAGDIFLVEGTNNNSQSSS 

CNGALBSTARHDEESHSLSPPGENTVMADSFQIKVNLMTVEALE 

EGDYFBAIPLKASKFNSDLIDFASTSOAFNKVPSPHETKPDEDA 

E AFENHAEKLGXRS I KSAHKKKDSPE PQ VKMDKHE PHQDSGEEA 

EGCPSAPEETPVDKKPEVKBKAKRKSTRPHYEEEGEDDDLQGVG 

EELSSSPPSSCVSLETLGSHSEEGLDFKPSPPLSKVSVIPHDLF 

YFPHYEVPLAAVLEAYVEDPEDLKNBEMDLEEPEGYMPDLDSRE 

EEADGSQSS S SSS VPGESL PS ASDQ VLYLSRGG VGTTPAS E PAP 

LAPHEDHQQRETKENDPMDSHQSQBSPNIiENIANPIiEENVTKES 

ISSKKXEKRKHVDHVESSZ.FVAPGSVQSSDDLEEDSSDYSIPSR 

TSHSDSSIYLRRHTHRSSBSDHFSLCSVEERSRSG 


6285 


2157 


1331 


SCKTENLLEMWWFQQGLSFLPSALVIWTSAAFIPSYirAVTLHlf" 
IDPALPYI SDTGTVAPE KCLFGAMLNI AAVLCIAT I YVRYKQ VH 
ALS PEENVI IKLNKAGLVLGILSCLGLS I VANFQKTTLFAAHVS 
GAVLTFGMGSLYMFVQT I LS YQMQ PKI HGKQ VFW IRL LLVI WCG 
VSALSMLTCSSVLHSGNFGTDLEQKLHWNPEDKGYVLHMXTTAA 
EWSMSFSFFGFFLTYI RDFQKI SLRVEANLHGLTLYDTAPCP IN 
WRT? TP r.T . Q i> r> T 


6286 


1619 


276 


JCAGASCCGSANPWSVGKSCVLLAMAQLQTRFYTDNKKYAVDDV 
PFSIPAASEIADLSNI INKLLKDKNEFHKHVEFDFLIKGQFLRM 
PLDKHMEMENI SSBE WEI E Y VEKYTAPQPEQCMFHDDWI S S I K 
GAEE W I LTGS YDKTS R IWSLBG KS I MT I VGHTDWKDVAWVKKD 
SLSCI,LIiSASMDQTI LLWEWNVERNKVKALHCCRGHAGSVDS IA 
VDGSGTKFCSGSWDKMLKIWSTVPTDEEDEMEESTNRPRECKQKT 
EQLGLTRTPI VTLSGHMEA VSSVLWSDAEB I CS AS WDHT I R VWD 
VES GS LKS TLTGNKVFNCI S YSPLCXRLASGS TDRH IRLWDPRT 
IO)GSLVSr,SLTSHTGWVTSVKWSPrHBQQLISGSLDNlv:<LWDT 
RSCKAPLYDLAAHHDKVLSVDWTDTGLLLSGGADNKLYSYRYSP 
TTSHVGA


6287 


278 


1482 


MQFFFNFQIGIaRSTSGKEKYSGDAGPLGDALQLPLQCLAIiDEDF 

APAKLQVQKILCDLLLPENLKEGLKESSWSSLPCTKNRPFDFHS 

VMEESQSLNEPSPKQSEEIPBVTSEPVKGSLNRAQSAQSINSTE 

MPARBDCLKRVSSEPVLSVQEKGVLLKRKLSLLEQDVIVNBDGR 

NKLFCKQGETPNEVCMFSLAYGDI PEEL IDVS DFECSLCMR LFFE 

PVTTPCGHSFCKNCLERCLDHAPYCPLCJCESLKEYLADfiRVCVT 

QLLEELIVKYLPDBLSERKKIYDEBTAELSHLTKNVPIFVCTMA 

YPTVPCPLHVFEPRYRLMIRRSIQTGTKQFGMCVSDTQNSFADY 

GCMLQIRNVHFLPDGRSWDTVGGKRFRVLKRGMKDGYCTADI B 
YLEDV 


6288 


1 


743


VTLYPCRGLVC5NLLLGASGMASGCKIGPSILNSDIJU!IUSAECLR~ 
MLDSGADYLHLDVMDGHFVPNI TFGHPVVESLRKQLGQDPFFDM 
HMMVSKPEQWVKPMAVAGANQYTFHLBATENPGALIKDIRENGM • 
KVGLAI KPGTS VE YL APWANQI DMALVMT VEPG FGGQ KFMEDMM 
PKVHWLRTQFPSLDIEVDGGVGPDTVHKCAEAGANMIVSGSAIM 
RS EDPRS VINLLRNVCSEAAQKRSLDR 


6289
6290


1 


743 


VTLYPCRGLVGNLLLGASGMASGCKIGPSILNSDLANLGAECLR 
MLDSGADYLHLDVMDGHFVPNITFGHPWESLRKOLGODPFFDM 
HMMVS KPEQWVKPMAVAGANQ YTFHLEATENPGALI KDI RKNGM 
KVGLAIKPGTS VE YLAP WANQ IDMAL VMT VK PG FGGQKFMEDMM 

PKVHWLRTQFPSLDIEVDGGVGPDTVHKCAEAGANMIVSGSAIM 
RSBDPRSVINLLRNVCSEAAQKRSLDR 




3 


1356 


TiiGKWLm V ¥ ETVAfc'TLACLPRPRLRRRRRRRRRRM I SRYTRKA" 
VPQSLELKGITKHALNHHPPPEKLEEISPTSDSHBKDTSSQSKS 
VI TRESS FTSADTGNSLSAFPS YTGAGI STEGSSD FSWG YGELD 
QNATEKVQTMFTAIDELLYEQKLSVHTKSLQEECQQWTASFPHL 
RlLGRQIITPSBGYRLYPRSPSAVSASYETTLSQBRDSTrFGIR 
GKKLHFSSS YAHKASS I AKSSS FCSMER DEEDS 1 1 VSEG I IEE Y 
LAFDHIDIEEGFHGKKSEAATEKQiCLGYPPIAPFYCMKEDVLAY 
VFDS VWCKWS CMEQLTRSHWEGFASDDESNVAVTRPDS ESS CV 
LSELHPLVLPRVPQSKVLYITSNPMSLCQASRHQPNVNDLLVHG 
MPLQPRNLSLMDKLLDLDDKLLMRPGSSTILSTRNWPNRAVEFS 
TSSLS YTVQS TRRRNP P PRTLHP IS TS HSCAETPRS VEE I LRGA 
RVPVAPDSLSSPSPTPLSRNNLLPPIGTAEVEHVSTVGPQRQMK 
PHGDS SRAQSAVVDEPN YQQPQERLLLPDFFPRPNTTQS FLLDT 
QYRRS CAVE YPHQARPGRGS AGPQLHGSTKS QSGGRP VS RTRQG 


6291


1732 


602 


LVAKMASS ASART PAGKR VI NQEELRRLMKEKQRLS TSR KR I ES 
PPAKYNRLGQLSCALCNTPVXSELLWQTHVLGKQHREKVAELKG 
AKE ASQGS SAS SAP QS VKRKAP DADDQDVKRAKATI iVPQVQP ST 
SAWTTNFDKIGKEFIRATPSKPSGLSLLPDYEDEEEEE5EEEGD 
GBRKRGDASKPLSDAQGKEHSVSSSREVTSSVLPNDFFSTNPPK 
A P 1 1 PHSGS I EKAE I H EK WERRENTAEALPEGFFDD PBVDARV 

RXVDAPKDQMDKEWDEFQKAMRQVNTISEAIVAEEDEEGRLDRQ 
IGEIDEQIECYRR VEKLRNRQDEIKNiCI»XEI LTT wmw bt?i?t?vi 

ADSDDEGELQDLLSQDWRVKGALI* 


6292


1835 

2382 H- 


1142


TGPGAMKMVAPWTRFYSNSCCLCCHVRTGTIIjLGVWYLIINAVV 
LLlLLSAIiADPDQYNFSSSELGGDFEFMDDANMCTAIAlSLLMI 
LICAMATYGAYKQRAAWI I PFFCYQIFDFALNMLVAITVL I YPN 
SIQBYIRQLPPNFPYRDDVMSVNPTCLVLI ILLFIS I ILTFKGY 

LIS CVl^CYR YI NGRNSSD VLVYOTSM)TTVLLPP YDDATVNGA 
&KEPPPPYVSA 


6293




1035 

i 
1 


kwctlgtvdvhpigwcainskilvpprtihakftdwkgylmkrl" 

/GSRTLPVDFHIKMVESMKYPFRQGMRLEVVDKSQVSRTRMAW 
WIGGRLRLLYEDGDSDDDFWCHMWSPLIHPVGWSRRVGHGIK 
MSERRSDMAHHPTFRKIYCDAVPYIiFKKVRAVYTBGGWFBEGMK 
LEAI DPLNLGNI C VAT VCKVLLDG YLM I CVDGGP STDGLD W FC Y 
HASSHAIFPATFCQKNDIELTPPKGYEAQTFNWENYLEKTKSKA 
APSRLFNMDCPNHGFKVGMKLEAVDLMEPRLICVATVKRVVHRL 
LSIHFDGWDSEYDQWVDCBSPDIYPVGWCBLTGYQLQPPVAAEP 
ATPLKAKEATKKKKKQFGKKRKRIPPTKTRPLRQGSKXPLLEDD 
PQGARKISSEP VPGEI IAVRVKEEHLDVASPDKASSPBLPVSVB 
MIKQETDD 


6294 


354 


1814


AQLTTKGRTVAGGVRWIPSPFPDLBLYSCTT/lTnorn'bTrT ouun 
KNV IATAS DYDMAE ITMIRPSFDVSPWAGLIGASVLWCVSVT 
VFVWSCCHQQABKKHKNP P YKFIHMLKG I S I YPETLSNKKKI I K 
VRRDKDGPGREGGRRNLLVDAAEAGLLSRDKDPRGPSSGSCIDQ 
LPIKMDYGEELRSPITSLTPGBSKTTSPSSPBEDVMLGSLTFSV 
DYNFPKKALWTIQEAHGLPVMDDQTQGSDPYIKMTILPDKRHR 
VKTRVLRKTLDP VFDE T FTF YG I P YSQLQDLVLHFLVLS FDRFS 
RDDVTGEVMVPLAGVD PSTGKVQLTRDI I KRNIQKCISRGELQV 
SLS YQPVAQRMTVWLKARHLQKMDIAGI^GNPYVKVNVYYGRK 
R I AKKKTHVKKCTLNP IFNES F I YDI PTDLLPDI S I EFLVI DFD 

RTTKNEVVGRLILGAHSVTASGAEHWREVCBSPRKPVAKWHSLS 
EY 


6295 


279* 

* 


617 


VSSALLTGATSG3DAAKSEGASAS PLSCTNAVAMDRPDEGPPAR - 
TRR LSSSES PQRDPP P P PP PPPLLRLPLPP PQQRPRLQEETEAA 
QVLADMRGVGIiGPAXiPPPPPYVILEEGGIRAYFTLGAECPGWDS 
TIESGYGEAPPPTESLEALPTPEASGGSLEIDFQWQSSSFGGE 
GALETCSAVGWAPQRLVDP KSKEEAX II VBDEDEDERBS MRS 9R 
RRRRRRRRKQRKVKRES RERNAERMES ILQAL ED IQLDLEAVNI 
KAGKAFLRLKRKFIQMRRPFLERRDLIIQHIPGETgVKAFLNHPR 
ISILINRRDEDIFRYIiTNLQVQDLRHISMGYKMKLYFOTNPYFT 
NMVIVKEFQRNRSGRLVSHSTPIRWHRGQEPQARRHGNQDASHS 
FFS WFSNHSLP EADRIAE I IKNDLWVNPLRYYLRERGSRTICC vv 
QEMKKRKTRGR CE WI MEDAPDYYAVEDI FS B I SDI DETIHD I K 
ISDFMETTDYFETTDNBITDINENICDSENPDHNEVPNNETTDN 
NESADDHETTDNlJESADmNENPEDNNKNTDDNEENPNNNBNTY 
GNNFFXGGFWGSHGNNQDSSDSDNEADEASDDEDNDGNEGDNEG 
SDDDGNEGDNEGSDDDDRDIE YYEKVI EDFDKDQADYEDVI E 1 1 
SDESVEEEG1 EEG IQQDEDI YEEGNYEEEGSEDVMEEGEDS DDS 
DLEDVLQVPNGWANPGKRGKTG 


6296 


727 


1199 


RHCGCDAQGACDS LPPTGTSS PVTARMAi P^ARCCVWLLDGTT V 
EAVRPARERLARKELRQKRMQQFSRDSAYSSNKDSTCLLTERDT 
I^TSIiQFPSPFSGTISFGSFSDSGIFPLGSQCCLGFQQFSISGK 
KWAL IHKRVRLS VFGARWGR I YFGK 


6297 


1 


922 


QRAAAAS PSSCGPRGAEYGALMAMEGYWRFLALLGSALLVGFLS 
VIFALVWVLHYREGLGWDGSALEFNWHPVLMVTGFVFIQGIAII 
VYRLPWTWKCS KLLMKS I HAGLNAVAAI LAI IS WAVFENHNVN 
NIANMYSLHSWVGLIAVICYLLQLLSGFSVFLLPWAPLSLRAFL 
MP IHVYSG I VI FGTVI ATALMGLTEKL I FSLRDPAYS TFPPEGV 
FVNTLGLL ILVFGALI FWIVTRPQWKRPKEPNSTILHPNGGTEQ 
GARGSMPAYSGNNMDKSDSELKNEVAARKRNLALDEAGQRSTM 


6298 


3 


985


SVPLRRI^LSGTI^GAGTTTKMAVARLAAVAAWVPCRSWGWAAV 
PFGPHRGLS VLLARI P QRAPR WLPACRQKTSLSFLNR PDL PNLA 
YKKLKGKS PGI I F I PG YLS YMNGTKALA I EEFCKSLGHACIRFD 
YSGVGSS DGNSEESTLGKWRiO)VLS I IDDLADGPQ I LVGSSLGG 
WLMIJIAAIARPEKVVALIGVATAADTLVTKFNQLPVELKKEVEM 
KGVWSMPSKYSEBGVYNVQYSFIKEAEHHCLLHSPIPVNCPIRI. 
LHGMXDDI VPWHTSMQVADRVLS TD VDVILRKHSDHRMREKADI 
2LLVYTIDDLIDKLSTIVN 


6299 


512 


814 


BCDLEGIMPNVTISLSLPTNGSPLQDILVHPCVTSLDSAILTSS 
S I DAMDESAFS G P YKFPFTPPLES FNIiCPYTSQ VP VPP I LGFYQ 
MKEEBVQLRNNH 


6300 


121 


692 


AAPS CWSQRG VPAAGTPS S PRLLVSRAAAPS AG PWGAWRQGARA 
AQSPFSrPNSSSVPYGSQDSVHSSPEDGGGGRDRPVGGSPGGPR 
LVTGS LPAHL5 PHM FGGFKCPVCS KFVSSDEMD LHL VWCLTKPR 

ITYNBDVLSKDAGECAICLEEIiQCGDTIARLPCLCIYHKGCIDE 
WFEVNRSCPEHPSD 


6301 


616 


264 


GKF VP VNW B P PQPLF FP KYLRC YRCLLETKE LG CL LGSDI CLTP~~^ 
AGSSCITLHKKKSSGSDVMVSDCRSKEQMSDCSNTRTSPVSGFW 
I FSQ YCFLD FCND PQNRGL YTP 


6302 


440 


745 


I FGFLHLFHMEHS PLLV'CALFAHVPFSSSCGSSVALHSDPCLLS 
PVLLNCLPGDLRPLDELYAQKLKYKAISEELDHALNDMTSL 


6303 


2 


1951 


YWKEYGGGLLWQSWQEKHPGQALSSEPWNFPDTKEEWEQHYSQL 
YW YY LEQFQ YWEAQGWT FDAS QSCDTDT YTS KTEADDKNDEKCM 
KVDLVSFLSSPIMGDNDSSGTSDKDHSEILDGISNIKLNSEEVT 
QSQLDSCTSHDGHQQLS EVS S KRECPASGQSEPRNGGTNEESNS 
SGNTNTDP PAEDSQKSSGANTSKDR PHASGT DGD ESEEDP PEHK 
PS KLKRS HELD I DENPASDFDDSGS LLGFKYGSGQ JCYGG I PNFS 
HRQVRYLEKKVKLKSKYLDMRRQIKMKWKHIFFTKESEKPFFKK 
SKILSKVEKFLTNVNKPMDEEASQBSSSHDNGHDASTSCDSEEQ 
DMSVKKGDDLLETNNPEPEKCQSVSSAGELETENYERDSLLATV 
PDEODCVTQEVPDSRQAETEAEVKKKKNKKKNK KVNGLP PE I AA 
VPELAKYMAQRYRLFSRFDDGIKLDREGWFSVTPEKIABHIAGR 
VSQSFKCDVWDAFCGVGGNTXQFALTGMRVIAIDIDPVKIALA 
RNNAEVYGIADKIEFIOGDFLLLASFLKADVVFLSPPWGGPDYA 
TAETFDIRTMMSPDG FEI FRLSKK1TNNI VYFLPRNADIDQVA3 
IAGPGGQVEIEQNFLNNKLKTITAYFGDLIRRPASET 


6304 


1 


1438


HRAR VD RS RES PGGDLRHPGRVRRD 1 TLSGHPRLSTQH V VLLRE 
DEVGDPGTKDLGHPQHGS P IQETQSE WTLVSPbPGSDMAALPA 
NRATSGLT L WPHTAEGRDLLG AENRALTGGQQAED PTLASG AYQ 
WPSSVEKLQGSWCDAETLLSSSRTGGQAPPWLTDHDVQMLRLL 
AQ3EVVDKARVPAHGQVLQVGFSTEAALQDLSSPRLSQLCSQGL 
CGI»I KR PGDLPE VLS FHVDRVLGLRRSLPAVARRFHS PLLP YR Y 
TDGGARP V I WWAPDVQKLSDP D3DQNS LALGWLQ YQALLAHS CN 
WPGQAP CPGIHHTE WARIiALFDFLLQVHDRLDR YCCGFEPEPS D 
PCVEERLREKCRNPAELRLVH ILVRS SDPSHLVYIDNAGNLQHP 
EDKLNFRLLEGIDGFPESAVKVLASGCU2NHLLKSLQMDPVFWE 
SQGGAQGLKQVLQTLEQRGQVLLGHIQKHNLTLFRDEDP 


6305 


59 


420 


NMIWRGRSTYRPRPRRSVPPPELZGPMLEPGDEEPQQEEPPTES 
RDPAPGQEREEDQGAAETQVPDLEADLQELSQSKTGDECGDGPD 
VQGKI LTKSEQFKM P EGR 


6306 


1 


1874 


PTRPSKVKVPHTFLIHSYTRPTVCQACKKLLKGLFRQGLQCKDC" 

KFNCHKRCATRVPNDCLGEALINGDVPMEEATDFSBADKSALMD 

BSSDSGVI PGSH55ENALHASEEEEGEGGKAQSSLGYI PLMRWQ 

S VRHTTRKS S TTLREGWVVHYSN KDTLRKRHYWRLD CKC ITLFQ 

NNTTNRYYKE IPLSEILTVESAQNFSLVPPGTNPHCFEIVTANA 

TYFVGEMPGGTPGGPSGQGAEAARGWETAIRQALMPVILQDAPS 

APGHAPHRQASLS I SVSNSQ I Q ENVDI ATVYQ I FPDE VLGSGQF 

GVVYGGKHllKTGRJDVAVKVIDKIjRFPTKQESQLRNBVAILQSLR 

HPGIVNLECMFETPEBCVFVVMEKLHGDMLEMILSSEKGRLPERL 

TKFLrTQILVALRHLHFKNIVHCDLKPENVLLASADPFPQVKLC 

DFGFARIIGEKSFRRSWGTPAYLAPEVLLNOGYNRSLDMWSVG 

VIMYVSLSGTFPFNEDED INDQ IGNAAFMYPASPWSHI SAGAID 

LINNIiQVKMRKRYSVDKSLSHPWLQEYQTWLDLRELEGKMGER 

YITHESDDARWEQFAASHPLPGSGLPTDRDIiGGACPPQDHDWQG 
LAERISVL


6307 


2136


589 


CFLLPRGRDPEPPEAGAAAPCAPGAPDMSFRKVVRQSKFRHVFG 
Q P VKNDQ CYED I RVS RVTWDSTFCAVN PKFLAVT VEASGGGAFI* 
VLPLS KTGRI D KAY PTVCGHTG PVLDIDWCPHNDEV I ASG S ED C 
TVMVWQI PENGLTS PLTEP VWLEGHTKRVGI I AWHPTARNVLL 
SAGCDWVLXWKVGTAEBLYRLDSLHPDLIYNVSWNHNGSLFCS 
ACKDKSVRI IDPRRGTLVAEREKAHEGARPMRAI FLADGXVFTT 
G FSRMS ERQ LALWDP ENLE E PMALQEIjDSSNGAIjIjPFYDPDTS V 
VYVCGKGDSSIRYFEITEEPPYIHFLNTPTSKEPQRGMGSMPKR 
GLEVS KCE I AR FY KLHERKCEP I VMTVPRXS DLFQDDLYPDT AG 
PEAALEAEEWVSGRDADP1LISLREAYVPSKQRDLKISRRNVLS 
DSRPAMAPGSSHLGAPASTTTAADATPSGSIARAGEAGKLEEVM 
QELRALRALVKEQGDRI CRLEEQLGRMBMGDA 


6308


2 


1118


GRPTRPEKMLLSLVLHTYSMRVLLPSWLLGTAPTYVLAWGVWR 
LLSAFLPARFYQALDDRLYCVYQSMVLPFFENYTGVQILLYGDL 
PKNKENI IYLANHQSTVDWIVADILAIRQNALGHVRYVLKEGUC 
WLPLYGtfYFAQHG GI YVKRS AKFNEKEMRNKLQS YVDAGT P MYL 
VI FPEGTR YNPEQTKVIiS AS QAFAAQRGLAVLKHVLTPR I KATH 
VAFDCM K^LDA I YDVTVVYEGKDDGGQRRESPTMTEFLC KECP 
KIHIHIDRIDKKDVPEEQEHMRRWLHERFEIKDKMIilEFYESPD 
PERRKRPPGKSVNSKLSIKKTLPSMLILSGLTAQMIiMTDAGRXX» 
YVNTWIYGTLLGCLWVTIKA 


6309 


220 


563 


L VAEVKE PCS LPMLS VDMENKENGS VGVKNSMENGRPPD P ADWA 
VMDWNYFRTVG FEEQASAFQEQE I DGKSLLLMTRNDVLTGLQL 
KLGPALKIYEYHVKPLQTKHLKNNSS 


6310 


36 


979 


GPROTKFIilLSSVMCETbRIGKAWPQSSGQE^YWTPRTHSSASS" 
AQRGSLA3LNVAAAGLWADCDQPLYDCPMCGLICTITYHILQEHV 
DLHLEENSFQQGMDRVQCSGDLQLAIIQLQQEEDRKRRSEESRQE 
IEEFQ KI£RQYGLDNSGG YKQQQLRHME I EVNRGRM P P SEFHRR 
KADMMESLALGFDDGKTKTSGI I EALHR YYQNAATDVRRVWLSS 
VVDHFHSSliGDKGWGCGYRNFOMLLSSliLQNDAYNI)CI.>KGMLIP 

CIPKIQSMIEDAWKEGFDPQGASQIillRLQGTKAWIGACEVYlL 
LTSLRV 


6311 


1 


675 


P VW WNS CEG PRLAAAARTGHG VGRRARiACLGEP R VKAAVKLTL 
AS BCLKRDDGLKGSRTAATAS DSTRRVSVRDKLLVKEVAELEANLi 
PCTCKVHF PDPNKLHC FQLTVTPDEG Y YQGG KFQ PETE VPDAYN 
MV? PKVKCLTKIWHPNITETGEI CLSLLRBHSIDGTGWAPTRTL 
KDVVWGLNSLFTDLI^FDDPLNIEAABHHLRDKEDFRNKVDDYI 
KRYAR 


6312


213 


1400 


GDELVKREAGMK^PGVGVFGTGSSARViVPLLRABGFTVEALW 
GKTEEEAKQLAEEMNIAFYTSRTDDILLHQDVDLVCISIPPPLT 
RQISVKALGIGKNWCEKAATSVDAFRMVTASRYYPQLMSLVGN 
VLRFLPAFVRMfOQt, I SEHYVGAVMI CDAR 1 YS GSLLS PS YGW IC 

DELMGGGGLHTMGTYIVDLLTHLTGRRAEKVHGLLKTFVRQNAA 
IRG I RHVTS DDFC PFQMLMGGGVCSTVT EtNFNKPGAF VHEVMW 
GSAGRLVARGADLYGQKNSATQEELLLRDSLAVGAGI.PEQGPQD 
VPLLYLKGMVYMVQALRQSFQGQGDRRTWDRTPVSMAASFEDGL 
YMQS WDAI KRS SRSGE WEAVEVLTE E PDTNQNI*CEALQRNNL 


6313 


2 


2071 


QRSGAARLAFLPSPFS PACVHRSPLSPHGCWF YFVWFMPLGVL 
FHRRRAHGCTLSCSSFVEQPTAMEAEETMECLQEFPEHHKKILD 
RLNEQREQDRFTDITL I VDGHH FKAHKAVLAACS KFFYKFFQEF 
TQEPLVEIEGVSKT-lAFRHLIEFTYTAiCLMIQGEEEANDVWKAAE 
FLQMLEAIKALEVRNKENSAPLEENTTGKNEAJCKRKIAETSNVI 
THSLPSAESEPVEIBVEIAEGTIEVEDEGIETLBEVASAKQSVK 
YIOSTGSSDDSAIALLADITSKYRQGDRKGQIFCEDGCPSDPTSK 
QVEGIEIVELQLSHVKDLFHCEKCNRSFKLFYHFKEHMKSHSTE 
PCT/US00/34263



SFKCEICNJKRYLRESAWKQHLNCYHLBEGGVSKKQRTGKKIHVC 
QYCEKQFDHFGHFKEHLRKHTGEKPFSCPWCHERFARNSTLKCH 
LTACQTG VG AKKGRKKLYECQVCNS VFNSWDQP KDHL VIHTGDK 
PNHCTLCDIiWFMQGNELRRHLSDAHNrSERLVTEEVLSVETRVQ 
T E PVTSMTI I EQVGKVHVLPLLQVQVDSAQVT VEQVHPDLLQDS 
QVHDS HMSELP EQ VQVS YLE VGRI QTEEGTE VHVEELHVE RVNQ 
MPVEVQTELLEADLDHVTPEIMNQEBRESSQADAAEAAREDHED 
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6314 


2 


2071 


qrsgaarlaflpspfspacvhrspLSphgcwpypwvfmplgvl 
fhrrrahgctlscssfveqptameaeetmeclqefpehhkmild 
rlneqreqdrftdituvdghhfkahkavlaacskffykffqef 

TQEPLVEIEGVSKMAFRHLIEFTYTAKLM1QGEEEANDVWKAAE 
FLQMLEAIKALEVRNKENSAPLEENTTGKNEAKKRKIAETSNVI 
TES L P S AESE P VEI EVE I AEGTI E VEDEG I ETL EE VASAKQSVK 
YIQS TGSS DDS ALALLADI TS KYRQGDRKGQI KE DGCPSD PTS K 
QVEGIEIVEliOLSHVKDLFHCEKCNRSFKLFYHFKEHMKSHSTB 
SFKCEICNKRYLRESAWKQHLNCyHLEEGGVSKKQRTOKKIHVC 
QYCB KQFOHFGHFIQEHLIIKHTGE KPFE CPNCHE R FARNSTLKCH 
LTACQ TGVGAKKGRKKLYE CQVCNSVFNS WDQFKDHLVI HTGDK 
PNTHCTLCDLWFMQGNELRRHLSDAHNI SERLVTEEVLSVETRVQ 
TEP VTSMTI IEQVGKVHVLPLLQVQVDSAQVTVEQVHPDLLQDS 
QVHDSHMSBLPEQVQVS YLE VGR IQTEEGTEVHVEELHVERVNQ 
MPVEVQTELliEADLDHVTPE 1 MNQEERE SSQADAAEAAREDHED 
AEDLETKPT VDSEAE KAENEDRTALP VLE 


6315 


1 


1015 


LGLAVNVVTTLVLI S YCPTATEEAPYWTYLLCAIiGLFI YQSLDA 
IDGKQARRTNSCS PLGELFDHG CDS LS TVFMAVGAS I AARLGTY 
PDWFFSCSFIGMFVFYCAHWQTYVSGMLRFGKVDVTBIQIALVI 
VFVLSAFGGATMWDYTIPILEI KLKI LP VLSFIiGGVI FSCSNYF 
HVILHGGVGKNGSTIAGTSVLSPGLHIGLII ILAIM1 YKKSATD 
VFEKH PCL Y I LNFG CVFAKVS QKLVVAHMTKSELYLQDTVPLGP 
GLLFLDQ YFNNF I DE YWLWMAMVI S S FDM VI YFS ALCLQ I S RH 
LHLNI FKTACHQAPEQVOVLSSKSHQNNMD 


6316 


1503


792 


VSAGAGTGI MGGTTSTRRVTFEADENEll I rVVKG I RLSENVIDR 
MKESSPSGSKSQRYSGAYGASVSDBEIiKRRVAEBLALEQAKKES 
EDQKRLKQAKELDRERAAANEQLTRAILRERICSEEERAKAKHL 
ARQLHEKDRVLKKQDAFYKEQIARLEERSSEFYRVTTEQYQKAA 
EEVEAXFKRYESHPVCADLQAKILQCYRENTHQTLKCSALATQY 
MHCVNHAKQSMLEKGG 


6317 


102 


339 


PSAQTS AVLAREKGHLPTMRH EAPMQMAS AQDAR YGQKDS S DQN 
FDYMFKLLI IGNSSVGKTSFIiFRYADDS FTSAFVSTVGIDFKVK 
TVFKNEKR I KLQI WDTAGQERYRTITTAYYRGAMG F I LMYD I TN 
EES FNAVQD WSTQI KT YSWDNAQVILVGNKCDMEDERVI STERG 
QHLGEQLGFEFFETSAKDNINVKQTFBRLVDIICDKMSESLETD 
PA I TAAKQNTRLKBTPPP PQPNCAC 


6318 


1765 


733 


PWHPI*RTLPLHHPHPRPPRAEGREGADSMSHLPGLEI>RREAPPL 
LGPLLS P F PLPAGS WHRQMLRSSLRFPI TWSAGAPCKAAGRMNI 
IJ^VRRDRVLAELPQCLRKEAALHGHKDFKPRVTCACQEHRTGT 
VGFKISKVrWGDLSVGKTCLlNRFCKDTFDKNYKATIGVDFEM 
ERFEVLG I PFSLQLWDTAGQERFKCIASTY¥RGAQAI I IVFNLN 
DVASLBHTKQWLADALKENDPSSVLLFLVGSKKDLSTPAQYALM 
EKDALQVAQEMKAEYWAVSSLTGENVREFFFRVAALTFEANVLA 
ELEKSGARRIGDWRINSDDSNLYLTASKKKPTCCP 


6319


38 


717 


AATMRLNQNTLLLGKKWLVPYTSEHVPSRYHEWMKSEELQRLT 
ASEPLTLEQEYAMQCSNQEDADKCTFrVLDAEKWQAQPGATEES 
CMVGDVNLFLTDLEDLTLGE IEVM IAEPS CRGKGLGTE AVLAML 
S YGVTTLGLTKFEAK IGQGNEPS I RMFQKLHFEQVATS S VFQE V 
TLRLTVSESBHQWLLEQTSHVEEKPYRDGSAEPC 


6320 


90 


1111 


RPRTGREKVAMAAVDS FYLLYRE IARSCWCYMEALALVGAWYTA 
R1CS I TVI CDFYSLI RLHFI PRLG SRADL I KQYGRWA WS GATDG 
IGKAYAEELASRGLNI IL I SRNE E KLQWAKDIADTYKVETDI I 
VAD PSSGRE I YLPI REALKDKDVGILVNNVGVFYPYPQYFTQIiS 
EDKLWDI I NVN I AAASLMVHWL PGMVERKKGA I VTI S SGSCCK 
PT PQLAAFS ASKAYLDHFS RALQ YEYASKG I FVQSLIPFYV7VTS 
MTAPSNFE»HRCS WLVPS P FCVYAHHAVSTLG I S KRTTG Y WSHS IQ 
FLFAQYMPEWLWVWGANILNRSLRKEALSCTA 


6321 


1418 


341 


HR KAALGALMAGRLLGKALAAVS LSLALAS VT I RSSRCRG IQAF 
RNS FSSSWFHIjNTNVMSGSNGSKENSHNKARTSPYPGSKVERSQ 
VPNEKVGWLVEWQDYKPVEYTAVSVLAGPRWADPQISESNF3PK 
FNEKDGHVERKS KNGLYE I ENGR PRN P AGRTG LVGRG L LGR WG P 
NKAADP 1 1 TRWKRDS SGNKIMHP VSGKH ILQFVAIKRKDCGEWA 
I PGGMVDPGEKI SATLKREFGEEALNSLQKTSAEKRB IEEKLHK 
LFSQDHLVIYKGYVDDPRNTDNAWMETEAVNYHDETGEIMDNLM 
LEAGDDAGKVKWVDINDKLKLYASIISQFIKLVAEKRDAHWS EDS 
EADCHAL 


6322 


2047 


1083


NQEILKNVESSRTVQPHFLEFLLSI*GWSVDVGRHPGWTGHVSTS 
WS INCCDDGEGSQQEEVISSEDIGAS I FNGQXKVLYYADAtjTEI 
AFWPSPVESLTDSLESNISDQDSDSNMD^MPGILKQPSLTLEI* 
FPNHTDNLNSSQRLSPSSRMRKLPQGRPVPPLGPETRVSVVWVE 
RYDD I ENFPLS ELMTB ISTGVETTANSSTSLRSTTLEKE VPVIF 
IHPLNTGI.FR I KIQGATGKFNMVI PLVDGM I VS RRALGFLVRQT 
VINICRRKRIiESDSYSPPHVRRKQKITDIVNKYRNKQLEPEFYT 
SLFQEVGLKMCSS 


6323 


1 


656 


PASTTDGAQBARVPLDGAFW I PRP PAGSPKGCFACVSKPPALQA " 
PAAPAPEPS AS P PMAPTLFPMES KS S KTDSVRAAGAP P ACKEOLA 
EKKTMTNPTTVIEVYPDTTEVND YYLWS I FNFVYLNFCCLGFI A 
LAYSLKVRDKKLLtTOLNGAVEBAKTDRLINlTRSGLAASCIMLW 
MALSVIATHRGLRSSASILVAEPHDWWTERPQVTFRERCPAL 


6324 


1 


2061 


EGAGMRRCPCRGSLNEAEAGALPAAARMGLEAPRGGRRRQPGQQ 
RP G PGAGAPAGRPEGGGP WARTEGSSLHSEPERAGliGPAPGTES 
PQAEFWTDGQTEPAAAGLGVETERPKQKTEPDRSSLRTHLEWSW 
SELGTTCLWTETGTDGLWTDPHRSDLQFQPEBASPWTQPGVHGP 
WTELETHGSQTQPERVKSWADNliWTHQNSSSLQTHPEGACPSKE 
PSADGSWKELYTDGSRTOQDIEGPWTEPY'lUGSQKKQDTEAARK 
QPGTGGFQIQQDTDGSWTQPSTDGSQTAPGTDCLLGEPEDGPLE 
EPEPGELLTHLYSHLKCSPLCPVPRLIITPKTPEPEAQPVGPPS 
RVEGGSGGFSSASSFDBSEDDWAGGGGASDPEDRSGSKPWKKL 
KTVLKYSPFWS FRKHYPWVQLSGHAGNPQAGEDGRr LKRFCQC 
EQRS LEQLMKDP LRPFVPAYYGMVLQDGO/T FNQMEDLLADFEG P 
SIMDC!01GSRTYIiEEELVKARHRPRPRKI)MYEKMVAVDPGAPTP 
E EHAQGAVTKP RYMQWRETMSSTSTLGFRI EGI KKADGTCNTNF 
KKTQALEQVTKVLEDFVDGD KVI LQKYVACLEELREALE I S PFF 
KTHEVVGSSLLFVHDHTGLAKVWMI DFGKT VALPDHQTLSHRLP 
NAEGNREDGYLWGLDNMICLLQGLAQS 


6325 


165 


944 


GLRDPFRRKRRLKPQVKMSNYVNDMWPGSPQEKDSPSTSRSGGS 
SRLSSRSRSRSFSRSSRSHSRVSSRFSSRSRRSKSRSRSRRRHQ 
RKY RR Y S RS YS RS R S RSRS RR YR ERR YG FTRR YYR S P S RYRS RS 
RSRSRSRGRSYCGRAYAIARGQRYYGFGRTVYPEEHSRWRDRSR 
TRSRSRTPFRLSEKSRMELLEIAFOmAKALGTTNIDLPASJLRr 
VPSAKETS RG IGVSSNGAKPEVS I LGLS EQNFQ KANCQ I 


6326 


238


680 


GEPSPATQQKPSATGAGVLHQHFSSGHIYVLMGLIjPPPWTISFT 
VQTTLQPPGGLPAAPVSGRMAFBPVGRDLARRMVPRAGKRTQTL 
GARRVAAQGARPLPEDRRPKSGERLHVTVAPCWEFVLPSVSLTA 
QAWGG VGQE AS SGVP


6327 


1 


1337 


S LARIiAPAGGS VVMPTQQPAAPSTRAPKPSRsLSGSLCALFS DA 
DSGSGMXAELPPGPGAVGRBMTKEEKLQLRKEKKQQKKKRJOEEK 
GAEPETGSAVSAAQCQGPTRELPESGIQLGTPREKVPAGRSKAE 
LRAERRAKQEAERALKQARKGEQGGPPPKASPSTAGETPSGVKR 
LPEYPQVDDLLLRRLVKKPBRQQVPTRKDYGSKVSLFSHLPQYS 

TJOWQr.TftlTMCTDCCVTUOnM^rDT fwcmfT t rr> nrvTxn^'riiT t t» 

.Kyrou x y f Walroo V IJi^AMVKJLAaLQ Y SQGjjVRGSNARCIALLR 
ALQQVIQDYTTPPNEELSRDLVNKLKPYMSFIjTQCRPLSASMHN 
AIKFLNKE ITSVGS SKREEBAKSELRAAI DRYVQEKIVLAAQAI 
SRFAYQKISNGDVI L VYGCS S LVSRI LQEAWTEGRRFRVWVDS 
RPWLEGRHTLRSLVHAGVPASYLLIPAASYVLPEVSTEEKDSKV 
GGEKV 


6328 


1030 


276 


HAS AE VTTAAARGIiGAMEEEMHTDAK IRAENGTGS S PRG PGCSL 
RHFACEQMLLSR PDGSAS FLQGDTS VLAGVYGPABVKVSKE I FIT 
KATLEVII»R?KIGLPGVAEKSRERLIRNTCEAVVLGTLH PRTS I 
TWLQWS DAGSLLACCLNAACMALVDAGVPMRALFCGVACALD 
SDGTLVLDPTSKQEKEARAVLTFALDSVERiaLMSSTKGLYSDT 
ELQQCLAAAQAASQHVFRFYRESLQRRYSKS 


6329 


3 


2016 


SS EVAAGGGTRSAMAEGS GE WT VSATGAAKGLNNGAGGTSATT 
SMPLSRKLHKILETRLDNDKEMLEAIJCALSTFFVENSLRTRRNL 
RGDI ERKSLAINEEFVS I FKEVKEELES ISEDVQAMSNCCQDMT 
SRLQAAKEQTQDLIVKTTKLOSBSQKIjBIRAQVAIIAFLS KFQLT 
SD EMSLLRGTREG P T TEDFFKALGR VKQ I HNDVKVLLRTNQQTA 
GLEIMEQMALLQETAYERLYRMAQSECRTLTQESCDVSPVLTQA 
ME ALQDRP VLYKYTLD E FGTARRS TWRGF IDALTRGGPGGTPR 
PI EMHS HDPLRYVGDMIAWLHQATASE KEHLEALLKHVTTQG VE 
EMIQBWGHrTEGVCRPLKVRIEQVIVAEPGAVLLYKISNLLKF 
YHHTISGIVGNSATALLTTIEEMHIiLSKKIFFNSLSLHASKLMD 
KVELPPPDLGPSSALNQTLMLLREVLASHDSSVVPLDARQADFV 

E FTDRRLEMLQ FQI EAH LDTI»IN E QAS YVLTRVGLS Y I YNTVQ Q 
HKPEQGSLANMPNLDSVTLKAAMVQFDRYLSAPDNLL IPQLNFL 
LSATVKEQIVKQSTBLVCRAYGEVYAAVMNPINEYKDPENILHR 
SPQQVQTLLS 


6330 


1151 


333 


FFYYTFYENKTFSRKMVAEKBT1.SLNKCPDKMPKRTKLLAQQPI, 
P VHQPHS LVS EGFTVKAMMKNS VVRGP PAAGAFKERPTKPTAFR 
KFYERGDFPIALEHDSKGNKIAWKVEIEJOiDYHHYLPLFFDGLC 
EMTFPYEFFARQGIHDMLEHGGNBCILPVLPQLI IPIKNALNLRN 
RQVI CVT LKVLQHL WSAEMVG KALVP YYRQ ILP VLN I FKNMNV 
NSGDGIDYSQQKRENIGDLIQETLBAFERYGGENAFINIKYWP 
TYESCLLN 




3 


455 


QWQRVRTRGRRACASATPLEGCVDLSYPRTHAALLKVAQMVTL 
L IAF I CVRSS LWTNYSAYS Y FEWT I CDL IM I IxAFYLVHLFRFY 
RVLTCISWPLSELLHYLIGTLLXiLIASIVAASKSYNQSGLVAGA 
I FGFMATFLCMASI WLS YKI SCVTQSTDAAV 


6332 


1 


"■' a? 8 


.VTESNKFDLVSFIPLLRERIYSNNQYARQPIISWILVLESVPDI 
NLLDYLPEI LDGLFQ ILGDNGKEIRKMCEWLGEFLKE I KKNPS 
S VKFAEMAN I LVIHCQTTDDLIQLTAMCWMREF I QLAGR VMLP Y 
SSGILTAVLP CLAYDDRKKS IKEVANVCNQSLM KLVTPEDDELD 
ELRPGQRQAB PTPDDALPKQEGTAS GEWTPSLHLTSCRG PREPD 
VIGVAU3PHLSNQDYFMYVTHTIVAATQRSGSSGSPPFCRQDTG 
KLSTMATHS QLVKTGTGLB PRQAVSS SH 


6333 


3 


1467 


TRTP SEAEAGGES PQS C VSAAHS DWTAGKPVSLLAPLI p PRSAG 
QPtTFSPSGRQPLRSLLVGMCSGSGRRRSSLSPTMRPGTGAERG 
GLMMGHPGMHYAPM®1HPMGQRANMPPVPHGMMPQMMPPMGGPP 
MGQMPGMMSSVMPGMMMSHMSQASMQPALPPGVNSMDVAAGTAS 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








GAKSHWTEHKSPDGRTYYYNTBTKQSTWEKPDDLKTPAEQLtSK 
CPWK3yKSDSGKPYYYNSQTKBSRWAKPKELEDLEGYQNTIVAG 
SL1TXSNLHAMIKAEBSSKQEECTTTSTAPVPTTEIPTTMSTMA 
AAEAAAAWAAAAAAAAAAAAANAHASTSASNTVSGTVPVVPEP 
E VTS I VATWDNENTVTI STEEQAQLTSTPAIQDQS VEVS SNTG 
EBTSKQETVADPTPKKEEEESQPAKKTYTWNTKEEAKQAFKEIiL 
KB KRVPSNAS WEQAMKMI INDPRYSALAKLSEKKQAFNAYKVQT 
BKK 


6334 


17


644


GGNPSGRAAGFAAAANPSSPLRVAWCSSNQNRSMEAHN1LSKR 
G PS VRS FGTGTHVKL PGPAPDKPNVYDFKTTYDQMYKDLLRKDK 
EL YTQNG IIiHM LDRNKR I KPR PERFQNCKDLFDL ILTCB ERVYD 
Q WED LNSREQETCQPVH VVNVDI QDNHEEATLGAFL I CE LOQC 
IQHTEDMENE IDELLQEPEEKSGRTFLHTVCFY 


6335


82 


529 


AARARPGVLCCRIjLGAALGDQSRVEMS YI PGQ P VTAWQRVEIH 
KLRQGENLILGFSIGGGIDQDPSQNPFSBDKTDKGIYVTRVSEG 
GPAEIAGLQIGDKIMQVNGWDMTMVTHDG^RKRLTKRSEEVVRL 
LVTRQS LQKAVQQSMLS 


6336 


1003 


438 


HEPASKGRAKVGNMRLSVAAAISHGRVFRRMGLGPESRIHLLRN
LLTGLVRHERIEAPWARVDEMRGYAEKLIDYGKLGDTNERAMRM
ADFWLTEKDLIPKLPQVLAPRYKDQTGGYTRMLQIPNRSLDRAK
MAVIEYKGNCLPPPPLPRRDSHLTLLNQLLQGLRQDLRQSQEAS
NHSSHTAQTPGI


6337 


76 


524 


EG I QMIjS VQPDTKP N3 CAG CNRKI KDRYLLKALDKYWHEDCLKC
AC CD CRLG EVG S T LYTKANL I LCRRD Y LRL FGVTGN CAACS KL I 

PAPEMVMRAKDNVYHLDCFACQLCWQRFCVGDia?FI.KWNMILCQ 
TDYEEGLMKEGYAPQVR 


6338 


66 


1349 


APNSESGTQGPLPTPANLFWTRRANPDPTTSMSATDRMGPKAVP 
GLRLALLLLLGLGTPKSGVQGQEGLDFPEYCGVDRVINVNAKNY 
KNVFKKYEVLALLYHEPPEDDKASQRQFEMEELI LELAAQVLED 
KGVGFGLVDSBKDAAVAKKLGLTEVDSMYVFKGDEVIEYDGEFS 
ADTIVEFIaLDVLEDPVELI EGERELQAFENIEDE I KLIGYFKS K 
DSEH Y KAFEDAAEEFHPYI P FFATFDS KG AKKLTLKLNE I D FYE 
AFMEEPVTI PDKPNSEEEIVNFVEEHRRSTLRKLXPESMYETWB 
DDMDG IH I VAEAEEADPDGFE FLETLKAVAQDN TENP DLS 1 1 WI 
DPDDFPLLVPYWEKTFDIDLSAPQIGWNVTDADRLWMEMDDEE 
DLPSAEELEDWLEDVGEGE INTEDDDDDDDD 


6339 


24 6 


-LB -3 


NRCDRGGGGQAERQAGQGCRTQGAGPGFGFGHS FFSQGAMKAFH 
TFCWLLVFGSVSEAKFDDFEDEEDIVEYDDNDFAEFEDVMEDS 
VTESPQRVIITEDDEDETTVELBGQDENQEGDFBDADTQEGDTE 
S EP YDDEE FEG YEDKPDTSS S KNKDP I X I VD VPAHLQNSWES YY 
LEILMVTGLIAYIMNYIIGKNKNSRIJ\QAWFNTHREIXESNFTL 
vuuiJUlwltfcATSTGKIiNQENEHIYNLWC^GRVCCEGMLIQLRFL 
KRQDLLNVLARMMRPVSDQVC? I KVTMNDEDMDTYV FAVGTR KAL 
VRtXJKEMQDIiSE PCS DKPKS GAXYGL PDSLAILS EMGEVTDGMM 
DTKMVHFLTHYADKI E S VHFSDQFSGP KI MQEEGQPLKLPDTKR 
TLLLTFNVPGSGNT YP KOT1B ALLPLMNMVI YSI DKAKKFR LNRE 
GKQKADKNRARVEENPLKLTHVQRQEAAQSRREEKKRAEKERIM 
HEEDPEKQRRt*EEAALRREQKKLEiOXQMKM:<QIKVKAM 


6340


2 


583 


EACAHTLSCPAFARLGRARRRPWMSHRTS STFRAERS FHSSSS S 
SSSSTSSSASRALPAQDPPMEKALSMFSDDFGSFMRPHSEPLAF 
PARPGGAGNI KTLGDAYE FAVD VRDFS PED 1 1 VTTSNNH I EVRA 
EXLAADGTVMNNFAHKCQLPEDVDPTS VTSALREDGS LTIRARR 
HPHTEHVQQTFRTBIKI 


6341 


2 


&4S " " 


KMAVLSAPGLRGFR I LGLRSS VGPAVQARG VHQSVATDGPSSTQ 
PALPKARAVAPKPS SRGB YWAKIiDDLVNWARRSS LW PMTFGLA 
CCAVEMMHMAAPRYDMDRFGVVFRASPRQSDVMIVAGTLTNKMA 
PALRKVYDQMPBPRYWSMGSCANGGGYYkYSYSWRGCDRIVP 
VDIYIPGCPPTAEALLYGILQLQRXIKRERRLQIWYRR 


6342 


2 


1191 


dprvramlatzj^vaai^ktclfsgrgggrglWtcrpqsdmnni 

KPLEGVKI IiDLTRVLAG P FATMNLGDLG AEVI JCVERP GAGDDTR 
TWGP PFVGTEST YYLS VNRNKKS IAVNI KDPKGVKI I KELAAVC 
DVFVENYVPGKLSAMGLGYEDIDEIAPHIIYCSITGYGQTGPIS 
QRAG YDAVASAVSGLMHI TGPE VACLSHlAANYItlGQ KEAKKWG 
TAHG S I VPYQAPKTKDGYI WGAGNNQQFATVCKILDLPELIDN 
SKYKTNHLRVHNRKELI KI LS ERFEE ELTSKWLYLFEGS G VP YG 
PIIWMK3^AEPQVLHNGLVMEMEH?TVGKISVPGPAVRY5KFK 
MS EAR PP PLLGQHTTH I LKE VLRYDDRAIGELLS AG WDOHETH 


6343 


2 


936 


G'J' AM VSDHDELNLLVI WDANP I W WGKQAL KES Q FTLS KC I DAV 
MVI^SHLFMNRSNKLAVIASHIQESRFLYPGKNGRLGDPFGDP 
GNPPEFNPSGSKDGKYELLTSANEVIVEEIKDLMTKSDIKGQHT 
ETLIAGSLAKALCYIHRMNKEVKDNQEMKS R I L V I KAAEDS ALQ 
YMNFMNVIFAAQKQNILIDACVLD5DSGLLQQACDITGGLYUCV 
PQMPSLLQYLLWVFLPDQDQRSQLILPPPVHVDYRAACKCHRNL 
IB IGYVCS VCLS I FCNFS P I CTTCErAFKXSL PP VLKAKXKKLK 
VSA 


6344 


2508 


147 


TMPTATLGNI^GYGMASPGLAAPSLTPPQLATPNLQQFFPQM^ - ^ 
QSLLGPPPVGVPtvWPSQFNLSGRNPQKQARTSSSTTPNRKDSSS 
QTMPVEDKSDPPBGSEEAAEPRMDTPEDQDLPPCPEDIAKEKRT 
PAPEPEPCBASELPAIGUiRSSBBPTEKEPPGQLQVKAOPOARMT 
VPKQTQTPDLLPE AliEAQVljpRFQ PRVLQVQAQVQS QTQPRI PS 
TDTQVQPKLQKQAQTQTSPEHLVLQQ KQVQPQLQQ2AE PQKQVQ 
PQVQPQAHSQGPRQVQLQQEAEPLKQVQPQVQPQAHSQPPRQVQ 
LQLQKQVQTQT YPQVHTQAQ PS VQPQ EHPPAQVS VQPPEQTHEQ 
PHTQPQVSLLAPEQTPVVVHVCGLEMPPDAVEAGGGMEKTLPEP 
VGT Q VS M EEI QNES ACGLD VGE CEN RAR EMPG VWGAGGS L KVTI 

LQSSDSRAFSTVPLTPVPRPSDSVSSTPAATSTPSKQALQFFCY 
ICKASCSSQQBFQDHMSEPQHQQRLGEIQHMSQACLLSLLPVPR 
DVLETEDEEP PPRRWCNTCQLYYMGDLIQHRRTQDH KI AKQSLR 
PFCTVCNRYFKT P R K FVE HVKSQGHKDKAKELKS LEKE I AGQDE 
DHFITVDAVGCFEGDEEEBBDDEDEEEIEVEEBLCKQVRSRDIS 
REEWKGS ETYS PNTAYGVDFLVP VKG YI CRICH KFYHSNSGAQL 
SHCKS LGHFENLQKYKAAKNPSPTTRP VS RRCA INARNALTALF 
TSSGRPPSQPNTQDKTPSKVTARPSQPPLPRRSTRLKT 


6345


2 


3483 


PRVRTKLILIiVNDKKRYERVGGGPKRLGRDVEMEEMIEQLOEKV 
HELEKQNDTLKNRL I S AKQQLQTQGYRQT PYNNVQSR INTGRRK 
ANENAGLQECPRKG I KFQDAD VAETPHPMFTKYGNSLL EE ARGE 
IRNLEWVI QSQRGQ I EELEHLAE ILKTQLRRKENEI ELSLLQLR 
EQQATDQRSNIRDNVEMIKLHKQLVEKSNALSAMEGKFIQLQEK 
QRTLKISHDALMAWGDELNMQLKEQRLKCCSLEKQLHSMKFSER 
RI BELQDR INDLE KERELLKENYDKL YDSAFSAAHEEQ WKLKEQ 
QLKVQIAQLETALKSDLTDKTEILDRLKTERDQNEKLVQENREIi 
QLQYLEQKQQLDELKKRIKLYNQEND INADELSEALIiLIKAQKE 
QKNGDLS FLVKVDS BINKDLERSMRELQATHAETVQEL3KTRNM 
LIMGJJKINKDYC^EVEAVTRKMEKLOXJDYELKVEQYVHLLDIRA 
AR IHKLEAQLKDI AYGTKQYKFKPE IMPDDS VDEFDE T I HLERG 

ENLFEIHINKVTFSSEVLQASGDKEPVTFCTYAFYDFELQTTPV 
VRGLHPEYNFTSQYLVHVNDLFLQYIQKNTITLBVHQAYSTEYE 
TIAACQLKFHEILEKSQRIFCTASLIGTKGDIPNFGTVEYWFRL 
RVPMDQAIRLYRERAKALGYITSMFKGPEHMQSLSQQAPKTAQL 
S STDSTDGNLNELH ITI RCOfflLQSRASHLQPHP YVVYXFTDFA 
DHDTAI I PS SNDPQFDDHMYF PVPMNMDLDRYLKSBSLS FYVFD 
DSDTQENIYIGKVNVPLISLAHDRCISGIFELTDHQKHPAGTIH 
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








VILKWKFAYLPPSGSITTEDLGNFIRSEEPEVVQRLPPASSVST
L VLAPR PKPRQRLTPVD KKVS PVD I MPHQS D VSQEGS VDE VKEN 
TB KMQQGKDD VSLLS EGQLAEQ SLAS SEDETE I TEDLE PEVB ED 
MSASDSDDCIIPGPISKNIKQPSEKiRIEIIALSLNDSQVTMDD 
TIQRLPVECRFYSLPAEETPVSLPKPXSGQWVYYNYSNVIYVDX 
ENNKAKRDI LKAI LQKQ EMPNRS LR PTWS DP P EDEQ DLBCED I 

GVAHVDLADMPQEGRDLIEQNIDVFDARADGEGIGKLRVTVEAL 
HALQSVYKQYRDDLBA 


6346 


2921 


533 


QDRRLLRLELQKTCQFTSTMSGSHTPACGPFSALTPSIWPQEIL 
AKYTQK EES AEQ PE FY YDE FG PR VYKEEGDEPGSS LLANS PLME 
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA 
G I PHGMRPQLWMRLS GALQKXRNS ELS YRE IVKNS SND BT I AAK 
QIEKDLLRTMPSNACFASMGSIGVPRI^RVLRALAWLYPEIGYC 
QGTGMVAACLLLFLEEEDAFWMMSAIIEDLLPASYFSTTLLGVQ 
TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASW 

DIKLIlLRIWDIiFPYFRSRVT.U'nT.TT/iMT ur trCDt?T rnoriTnn <t t 

FNTLS DI PS QMEDAELLLG VAMRLAGS LTDVAVETQRRKH LAYL 
IADQGOLLGAGTLTNLSQWRRRTQRRKSTITALLFGEDDLEAL 
KAKNI KQTELVADLREAI LRVARHFQ CTDPKNCS VVSRQLPGLL 
PNTALTPPTPLVGLYSLWQELTPDYSME5HQRDHENYVACSRSH 
RRRAKALLDFERHDDDELGFRKNDII TI VSQKDEHCWVGELNGL 
RGWFPAKFVE VLDERSKE YS IAGDDS VTEG VTDLVRGTLCPALK 
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRSL 
ICVGLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRSPGWVQIKC 

BLRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW 
DVDG 


6347 


2921 


533 


QDRRLLRLELiQKTCQPTSTMSGSHTPACGPFSALTPSlrtPQEIIi 
AKYTQKE ESAE QP E FY YDS FG FR VYKEEGDEPGSS LLANS PLME 
DAPQRIjRWQAHLE FTHNHDVGDLTWD KI AVSLPRS eklrs lvla 
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK 
QIEKDLLRTMPSNACFASMGSIGVPRLRRVLRALAWLYPEIGYC 
QGTGMVAACLLLFLEEEDAFWMMSAI I EDLLPAS YFS TTLLGVQ 
TDQRVLRHLI VQ YLPRLDKLLQEHDI ELS L X TLHWFLTAFAS W 
DI KLLLR I WDLFFYEGSRVLFQLTLGMLHLKEEELIQSEKSAS I 
FNTLS DI P S QMEDAE LLLGVAMRLAG S LT DVAVETQ RR KHLAYL 
I ADQGQLLGAGTLTNLSQWRRRTQRRKSTI TALLFGBDDLEAL 
KAKNI KQTELVADLREAI LRVARHFQCTDPKNCSWSRQLPGLL 
PNTALTP PTPL VGLYSLWQE LTPD YSME SHQR DHENYVACSRSH 
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL 
RGWFPAK FVE VLDBRS KEYS IAGDDS VTEGVTDLVRGTLCPALK 
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
C KTFR LDED G KVLT P E Eli YRAVQS VNVTHDAVHAQM DVKLRSL 
I CVT3LNEQVLRLWLBVLCSSLPTVEKWYQPWSFLR3PGV7VQIKC 

ELRVLCCFAPSLSQDWELPAKREAQQPLKBGVRDMLVKHHLFSW 
DVDG 


6348


3 


U13 


AGASKCFVTbLACFLAKQQtfkYKYEECKDL I KSMLRNELQFKBE 
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN 
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLS PENDN 
DDDEDVQVE VAEKVQKS SS PREMQKAEEKEVPEDSLBECAITCS 
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYBECKDL IKFMLRN 
ERQFKEBKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR 
DASRSLNEHLQALLTPDEPDKSQGQDLQEQIiAEGCRLAQHLVQK 
LS P ENDNDDDBDVQVE VAEKVQKS SAPREM P KAEEKE VPEDSLE 
ECAITC5NSHGPYDSNQPHRKTKITFKEDKVDSTLIGSSSHVEW 

BDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEBVPOESWDEG 

YSTLSIPPEMLASYKSYSSTPHSLEEQQVCMAVDIGRHRWDQVK 

KEDH2ATGPRLSRELLDEKGPBVLQDSLDHCYSTPSGCLBLTDS 

CQP YRSAFYVLEQQRVGLAVNMDEI SKYQBVEEDQDPS CPRLSR 

ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEB 

QYLGLALDVDRI KKDQEEEEDQGP PCPRLSRELLEWE PEVLQD 

SLDRCYSTPSSCLEQPDSCQPYGSSFYALBEKHVGFSLDVGEIE 

KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 

PEVLQDS LDRCYS TPSG CLELTDS CQPYRSAF Y I LEQQRVGLAV 

DMDEIEKYQEVEEDQDPSCPRLSGELLDBKEPEVLQESLDRCYS 

TPSGCLBLTDSCQPYRSAFYILBQQRVGIAVDMDEIEKYQEVEE 

DQDPSCPRLSRELLDEKE PEVLQDS LGRCYS TPSG YLELPDLGQ 

PYSSAVYSLEEQYLGLALDVDRIKKDQBEEEDQGPPCPRiSREL ' 

LEWEPEVIiQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH 

VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP 

RLNS MLMEVEE PEVLQDSLDI CYSTP SMYFELPDS FQHYRS VFY 

SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ 


6349 


3 



3679 


AGAEKCFVTliLACFLAKQQNKYKYEECKDL I KSMLRNELQFKEE 

XLAEQLKQABELRQYKVLVHSQERELTQLREKLREGRDASRSLN 

EIILQALIiT&DEPDKSQGQDLQEQLAEGCRLAQHLVOKLSPENDN 

DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 

NSHGP CDSNQ PHKNI K ITFEEDE VNS TLWDRE S S HDECQDALN 

iLPVPGPTSSATNVSMWSAGPIiSGEKAAINILEINEKLRPQIA 

EKKQQFRNLKBKCFLTQIACFLANQQNKYKYEECKDLIKFMLRN 

ERQFKEEIOAEQLKQAEBLRQYKVLVHSQERELTQLREKLREGR 

DASRSLtfEHLQALLTPDBPDKSQGQDLQBQIABGCRLAQHIiVOK 

LS PENDNDDDBD VQ VEVAEKVQKSS APRBMPKAEEKE VPEDS LE 

ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 

EDAVHIIPENESDDEEEEEKGPVSPRNLQESBEEEVPQESWDEG 

YSTLSI PPEMLASYKS YSSTFHSLEEQQVCMAVDIGRHRWDQVK 

KEDHEATGPRLSREIiLDEKGPEVLQDSLDRCYSTPSGCLELTDS 

CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR 

ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 

QYI^LALDVDRIKKDQEEEEDQGPPCPRLSRBLLEWEPEVLOD 

SLDRCYSTPSSCLEQPDSCQPYGSSFYALBEKHVGFSLDVGEIE 

KKGKGKKRRGRRSKKERRRGRKEGEBDQNPPCPRLSRELLDBKG 

PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV 

DMDE I EKYQEVEEDQDPS CPRLSG BLLDE KE PEVLQES LDRCYS 

TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEB 

DQDPSCPRLSRELLDEKEPBVLQDSLGRCYSTPSGYLBLPDLGQ 

PYSSAVYSLEEQyLGLALDVDRlKKDQEEEEDQGPPCPRLSREL 

LEWE PEVLQDS LDRCYST PS SCLEQPDS CQP YGSSFYALEEKH 

VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP 

RLNSMI^EVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 

o c aactti x orAux VDNKr FTLTVTSIiHLVFQMGVIFPQ 


6350 


3 


3679 


AGAEKC FVTLLACFLAJCQQNKY KYEECKDLl KSMLRNELQFKEE 
KLAEQLKQAEE LRQ Y KVLVHSQ ERELTQLREKLREGRDAS RSLN 
EHLQALLTPDEPDKS QGQDLQEQLAEG CRLAQHL VQKLS P ENDN 
DDD2DVQVEVAEKVQKSSSPREMQKAEBKEVPEDSLEECAITCS 
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 
EKKQQFRNLKEKCFLTQLACFLANCXSNKYKYEECiCDLIKFMLRN 
BRQ FKEEKLAEQLKQAEELRQ YKVLVHSQBRELTQLRE RLREGR 
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 
LSP ENDNDDDEDVQVEVAE KVQKS SAPREMPKAEEKBVPEDSLE 
E CAITCSNSHGP YDSNQPHRKTK It TFEED KVDS TL I G S S SHVE W 
EDAVHIIPENESDDEEEEBKGPVSPRNLQESEBBEVPQESWDBG 
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK 
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 
CQPYRSAFYVLBQQRVGLAVW5DEIEKYQE VEEDQD P SCPRLSR 
ELLDEKBPEVLQDSLGRCYSTPSGYLBLPDLGQPYSSAVYSLEE 
QYLGLALDVDRIKKDQBEEBDQGPPCPRLSRELLEWEPEVLQD 
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGPSLDVG2IE 
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 
PEVLQDSLDRCYSTPSGCLBLTDSCQPYRSAFYILEQQRVGLAV 
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 
TPSGCLELTDSCQP YRS AF Y I LEQ QRVGLAVDMDE I E KYQEVEE 
DQD PSCPRLSRE LLDEKEP EVLQDS LGRCYSTPSGYLBL P DLGQ 
PYSSAVYS LEEQ YLGLALDVDR I KXDQBEEEDQGPPCPRLSREL 
I»E WEPE VLQDSLDRC YSTPSS CLEQPDSCQP YGS SF YALEEKH 
VGPSLDVGE IEKKGKGKKRRGRRS KKERRRGRKEGEEDQNPPCP 
RItNSMLMBVEBPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 
SFEEEHISFALYVDNRFFTLTVTSLHLVPQMGVIFPQ 


6351 


1291 


319 


RKARRRTERSQLGRMLWEV7^NGRSLVWGAEAVQALRERLGVGG 
RTVGAIiPRGP RQNSRLG LPLLLMPEBARLLAB I GAVTLVS APRP 
DSRHHSLALTSFKRQQBESFQEQSALAAEARETRROELLEKITE 
GQAAKKQKLEQASGASSSQBAGSSQAAKEDETSDGQASGEQyEA 
G FSS SQAGPSNGVAPLPRS ALLVQLATARPRp VKAR P LDWRVQS 
KDWPHAGRPAHELR YS I YRDLW ERG FFLS AAGKFGGD FL VYPGD 
PLRFHAHY1 AQCWAPEDTI PLQDLVAAGRLGTS VRKTLLLCS PO 
PDGKWYTSLQWASLQ 


6352 


235 


923 


WSBWLSPCHAAKCKGLSMLRITMKTRAXSLAADATBFVQGRSAP 
AMARSLVHDTVFYCLSVYQVKISPTPQliGAASSABGHVGQGAPG 
LMGNMNPEGGVNHSNGMNRDGGMI PEGGGGNOE PRQQPQP PPEE 
PAQAAMEGPQPENMQPRTRRTKFTLLQVBELESVFRHTQYPDVP 

rRRELAENLGVTEDKVRVWFKNKRARCRRKQRELMIJ\NELRADP 
DDCVYIWD 


6353 


65 


672 


K tfAGAGA I PfcAKAR PPD VQAAEEEKEMDLPDSAS R VFCGR IIiS M 
VNTDD VNAI I LAQ JCNML DRFE KTNEMLLNFNNL S S ARLQQMS ER 
FLHHTRTIjVEMKRDLDS I FRRIRTLKGKLARQHPEAFSHIPEAS 
FLEBEDEDPIPPSTTTTIATSEQSTGS CDTSPDTVS PSLSPGFE 
DLSHVQPGSPAINGRSQTDDBEMTGE 


6354 


965 


510 


PSLRPMEPTRDCPLFGGAFSA1LPMGAIDVSDLRPVPDNQEVFC 

HPVTDQSLIVELLELQAHVRGEAAARYHFEDVGGVQGARAVHVE 

SVQPLSLENLALRG RCQEAW VLSGKQQ IAKENQQVAKDVTLHQA 
LLRLPQYQTDLLLTFNQPP 


6355


158


1662


RGSSAAFRGSGbRGAMIRRVLPHGMGRGLLTRRPGTRRGG'FSLD
WDGKVSEIKKKIKS ILPGRSCDLIiQDTSHLPPBHS DWI VGGGV
LGLS VAYtf liKKLES RRGAIRVIiWERDHTYSQAS TGLS VGG I CQ
QFSLPENI QLSLFSAS FLRN'NE YLAVVDAPPLDLRFNPSGYLL
LAS EKDAAAMESNVKVQRGEGAKVSLMS PDOLBN V wtwt r»r> i r
ALASYGMEDEGWFDPWCLLQGLRRKVQSLGVLFCQGEVTRFVSS
SQRMLTTDDKAVVLKRIHBVHVKMDRSIiEYQPVECAIVINAAGA
WSAQIAALAGVGEGPPGTLQGTKLPVEPRKRYVYVWHCPQGPGL
BTPLVADTSGAYFRREGLGSNYLGGRSPTEQEEPDPANLEVDHD
FFQDKVWPHLALRVPAFETLKVQSAWAGYYDYNTFDQNGWGPH
PLWNM Y FATGFSGHGLQQAPGIGRAVAEMVLKGRFQT I DLS P F
LFTRFYXiGEKIQENNI I


6356


354


633


TGLTSSCLPLgVMMTKRTKDMGKFSSVTVSTIDEEEEEIBAREV 
ADS YAQNAKV I EKQL E RKGMS KRRLQELABLEAKKAKMKGTLID 
fJQFK 






WO 01/53312 



PCT/US00/34263 





6357 


2 


91S 


GLLRNMALLVRVLRNQTSISQWVPVCSRLIPVSPTQGQGDRALS

RTSQNPQMSQSQACGGSEQIPGIDIQLNRKYHTTRKLSTTKDSP 

QPVBEKVGAPTKIIBAMGFTGPLKYSKWKIKIAALRMYTSCVEK 

TDFEEFFLRCQMPDTFNSWFLITLLHVWNCLVRMKQEGRSGKYM 

CRI I VHFMWEDVQQRGR VMGVNPY IIjKKNMI LMTNHF YAAI LG Y 

DEGILSDDHGLAAALWRTPFNRKCEDPRHLBLLVEYVRKQIQYL 

DSMNGEDLLLTGEVSWRPLVEKNPQSIIiKPHSPTYNDEGL 


6358 


2009 



1040 


ASDALHSLSAPVXiRLSSRSAARPATMTEQAI SFAKD FLAGGIAA 
Al SKTAVAPIERVKLLLQVQHASKOIAADKQYXGI VDCI VRI PK 
EQGVliS FWRGNLANVIRYFPTQALNFAFKDKYKQ1 FLGGVDKHT 
QF WR YFAGNIiAS GGAAGATSI.C FVYPLDFARTRLAADVGKSGTE 
RB FRGLGD CLVK I TKSDG IRGL YQG FS VSVQGI 1 1 YRAAYFGVY 
DTAKGMLPDPKNTHIWSWMIAQTVTAVAGVVSYPFDTVRRRMM 

MQSGRKGADIMYTGTVDCMRKIFRDEGGKAFFKGAWSNVLRGMG 
GAFVLVLYDELKKVX 


6359 


98 


1086 


VCRQEEEKMKEDCIiPSSHVPISDSKSIQKSELLGLLKTYNCYHE 
GKSFQLRHRBEEGTL 1 1 EGLLN I AWGLRRP IRLQMQDDREQVHL 
PSTSWMPRRPSCPLKEPSPQNGNITAQGPSIQPVHKAESSTDSS 
GPLEEAEEAPQLMRTKSDASCMSQRRPKCRAPGEAQRIRRMRFS 
INGHFYNHKTSVFTPAYGSVTNVRVNSTMTTLQVLTLLLNKFRV 
EDGP SEFALYI VHESGERTKLKDCE YPLISR I LHGPCE KI AR I F 
LMEADLGVEVPHEVAQYI KFEMPVLDSFVEKLKEEEERE I IKLT 
MKFQALRIiTMLQRLEQLVEAK 


6360 


1 


345 


GTRGAVPSTLEE WIjP PR S CRVFW I HSGTTMS KVS FKI TLTSDP
RLPYKVLSVPESTPFTAVLXFAABEFKVPAATSAIITKDGIGIN 
PAQTAGNVFLKHGSELRI IPRDRVGSC 


6361 


615 


158 


RPGLGQLQHCAIAPQAGNRRCRFHGRLHALTRSTHRGKPMSIMQ
FKDTLNTPLPDS S P VAVPLGAP1 AVASTLS VEHNDGVETOI WAC 
APGRWRRQITSQBFCHFIQGRCTFTPDDGETLHIQAGnALMLPA 
NSTGIWDIQETVRKTYVIilL 


6362 


350 


1576 


TTMDGSHSAALKLQQLPPTSSSSAVSEASFS YXENLIGALLAI F 
GHLWS IALRLQKY CH I RLAGS KD PRAY FKTKTWWLGLFLMLLG 
EUSVFAS YAFAPLSLTVPLSAVSVIASAIIG I IFI XEKWKPKDF 
LRRYVLS FVGCG1AWGTYLLVTFAPNSHEKMTGENVTRHLVSW 
PFLLYMLVEIILFC^LLYFYKBKNANNrVVrLLLVALLGSMTVV 
T VKAVAGMLVLS I QGNLQLD Y P I FYVM FVCMVATAVYQAA FLS Q 
ASQMYBSSLIASVGYIIjSTTIAITAGAIFYLDFIGEDVLHICMF 
ALGCL IAFIjGVFL I TRNRKK? I PFE P YI SMDAMP GMQNMHDKGM 

TVQPBLKASFSYGALENNDNISEIYAPATLPVMQEEHGSRSASG 
VPYRVLEHTKKE 


6363 


21 


1201 


RRTRLGSS FPRRRDS SAMES YDVXANQP WIDNGSGVIKAGFAG 
DQIPKYCFPNYVGRPKHVRVMAGALEGDIFIGPKAEEHRGLLS I 
RYPMEHGIVKDWNDKERIWQYVYSKDQLQTFSEEHPVLLTEAPL 
NPRKNRERAAEVFFETFNVPALFI SMQAVLSLYATGRTTG WLD 
SGDGVTHAVPIYEGFAMPHSIMRIDIAGRDVSRFLRLYLRfCEGY 
DFHSS S EF E IVKA I KERAC Y1*S INPQKDSTLETBKAQY YIiPDGS 
TI BIGPSR PRAPELLFRPDLIGEF.SEGIHEVLVPA1 QKSDMDLR 
RTLFSNI VLSGGS TLFKGFGDRLLSE VKKLAPKDVK I RX S APQE 
RLYSTWIGGSI LASLDTFKKMWVSKKE YEEDGARS IHRKTF 


6364


21 


1201 


RRTRLGSS FPRRRDS3AMESYDVIAHQPWIDNGSGVIKAGFAG 
DQIPKYCFPNYVGRPKHVRVMAGALBGDIFIGPKAEEHRGLLSI 
RYPMEHGIVKDWNDMERIWQYVYSIODQLQTFSEEHPVLLTEAPli 
NP3 KNRERAAEVFFET FNVPALFISMQAVLS LYATGRTTGWLD 
SG DGVTHA VPI YEG FAMPHS I MR I D I AGRD VSRFLRL YLR KEG Y 
DFHSS SEFEIVKAI KERAC YLS INPQKDETLETBKAQYYLPDGS 
TISIGPSRFRAPBLLFRPDLIGEBSEGIHEVLVFArOKSDMDLR 
RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQfe 
RLYS TWIGGS ILASLDTFKKMWVSKKEYEEDGARSIHRKTF 


6365 


234 


1993 


KH KSRASCAARAQAFG PS REREVHS RFRSGLRRLGESNSGCCTM 
ASMGTLAFDE YGRPPLI I KDQDRKSRLMGLEALKSHIKAAKAVA 
NTMRTS IiG PNGLDKX MVDKDGDVT VTNDGAT I LSMMD VDHQ I AK 
LMVELSKSQDDE IGDGTTGVWLAGALLEEAEQLLDRGIHP IRI 
ADGYEQAARVAIEHLDKISDSVLVDIKDTEPLIQTAKTTLGSKV 
VKSCHRQMAE IAVNAVLTVADMBRRDVDFELIXVEGKVGGRLED 
TKLIKGVTVDKDFSHPQMPKKVBDAKIAILTCPFEPPKPKTKHK 
LDVTSVEDYKALQKYKKBKFEEMIQQIKBTGANLAICQWGFDDE 
ANHLLLQNNLPAVRMVGGPEIELIAIATGGRIVPRFSELTAEKL 
GFAGLVQEISFGTTKDKMLVIEQCKNSRAVTIFIRGGNKMIIEE 
AKRS LHDALCVI RNLIRDNRWYGGGAAEIS CALAVSQEADKCP 
TLEQYAMRAFADALEVI PMALSBNSGMNPIQTMTEVRARQVKEM 
NPALGIDCLHKGTNDMKQQHVIETLI GKKQQ ISLATQMVRM ILK 
IDDXRKPGESEE 


6366 


257 


1898


GNKEGAHSSTFWVLLSIFLGAVAMLCKEC^ITVLGLNAVFDiLV 
IGKFNVLEIVQKVLHKDKSLENLGMLRNGGLLFRMTLLTSGGAG 
MLYVRWRIMGTGP PAFTEVDNPAS FADSMLVRAVNYNYY YSLNA 
WLLLCPWWLCFDWSMGCI PLI KS ISDWRVI AliAALWFCLI GLIC 
QALCSEDGHXRR I LTLG LGFLVI PFL PASNLFFRVG FWAERVT* 
YLPSVGYCVLLTFGFOALSKHTKKKKLIAAWLG I LFINTLRCV 
LRSGEWRSEEQLFRSALSVCPLNAKVHYNIGKNLADKGNQTAAI 
RYYREAVRLNPKYVHAMNNIiGNILKERNELQFJ^ELLS LAVQIQ 
PDFAAAWMNLGIVQNSIjXRFBAAEQSYRTAIKHRRKYPDCYYNL 
GRLYADLNRH VDALNAWRNATVLKPEHSIaAWNNM 1 1 LLDNTGKI* 
AQAEAVGREALEL I PNDHSLMFSLANVLGKSQKYKESEALFLKA 
IKANPNAASYHGNLAVLYHRWGHLDLAKKHYEISLQLDPXASGT 
KENYGLLRRKLBLMQKKAV 


6367 


287 


1934 


S I GFPVMLVLS I LLYTCEMFQDSVAFEDVAVS FTQEEWALLDPS 
QKNLYRDVMQETFKNLTSVGKTWKVQNIEDEYKNPRRNLSLMRE 
KLCESKESHHCGESFNQIADDMLNRKTLPGITPCESSVCGEVGT 
GHSSIiNTHIRADTGHKSSBYQEYGENPYRNKECKKAFS YLDS PQ 
SHDKACTKEKPYDGKECTETFISHSCIQRHRVMHSGDGPYKCKF 
CGKAFYFLMLCLIHERIHTGVKPYKCKQCGXAFTRSTTLPVHER 
THIX3VNADECKECGNAFSFPSEIRRHKRSHTGEKPYECKQCGKV 
FISFS S IQYHKMTHTGEKPYE CKQCGKAFRCGSHLQKHGRTHTG 
EKPYECRQCGKAFRCTSDLQRHEKTKTEDKPYGCKQCGKGFRCA 
SQLQIHERTHSGEKPHECKECGKVFKYFSSLRIHERTHTGEKPH 
ECKQCGKAFR YFS S LHIHERTHTGDK P YE CXVCG KAFTCSSS IR 
YHERTHTGEKPYECKHCGKAFISNYIRYHERTHTGEKPYQCKQC 
GKAFIRASSCREHERTHTINR 


6368 


1 


327 


RPVPAKLNPRSWFRTAGALPLRPPPLTMAVFHDEVEIEDFQYDE
DSSTYFYPCPCX3DNFSITKEDLENGEDVATCPSCSI.I IKVTYDK 
DQFVCGET VPAPS AN KELVKC 


6369 


1 


1745 


AGCCRDTRF PTPRGPGS LCHNFCRSAACT VTRT IHG S PREDTGT^ 

PRS REMMFQDSVAFE DVAVS FTQEEWALLDPSQKNIi YRDVMQET 

FKNLTSVGKTWKVQNIEDEYKNPRRNLSLMREKLCESKESHHCG 

ESFNQIADDMLNRKTLPGITPCESSVCGEVGTGHSSLNTHIRAD 

TGHKSSEYQEYGENPYRNKECKKAFSYLDSFQSHDKACTKBKPY 

DGKECTETFISHSCIQRHRVMHSGDGPYKCKFCGKAFYFLNIjCL 

IHERIHTGVKPYKCKQCGKAFTRSTTLPVHERTHTGVNADECKE 

CGNAFS FPS E IRRHKRS HTGE KP YE CKQ OGKVFI SFS S IQ YHKM 

THTGEKP YECKQCGKAFROGSHLQKHGRTHTGEKPYECRQ CGKA 

FRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCASQU3IHERTHSG 

EKPHECKECGKVFKYFSSLRIHERTHTGEKPHECKQCGKAFRYF 
SSLHIHERTHTGDKPYECKVCQKAPTCSSSIRYHElRTHTGBKPY ~ 
ECKHCGKA F ISN YI RYHBRTHTG EKP YQCKQCGKAF I RAS S CRE 
HERTHTINR 


6370


1711 


325


FVLSEQRLRTERTWPRS PGICRGAAAAGARTAG AG LLRXiL LGCG 
ALVGGLR P VTMTTP AMAQNAS KT WELSLYELHRTPQEAI MDGTE 
IAVSPRSLHSELMCPICLDKLKNTMTTKECLHRFCSDCIVTALR 
SGNKECPTCRKEQjVSKRSliRPDPNFDALISKrYPSREEYEAHQD 
R VL I RLS RLHNQQALS S S IEEGLRMQAMHRAQR VRR P I PGSDQT 
TTMSGGBGEPGEGEGDGEDVS SDSAPDSAPG PAP KR PRGGGAGG 
SSVGTGGGGTGGVGGGAGSEDSGDRGGTLGGGTLGPPSPPGAPS 
PPBPGGEIELVFRPHPLLVEKGEYCQTRYVKTTGNATVDHLSKY 
LALRIALERRQQQEAGEPGGPGGGASDTGGPDGCGGEGGGAGGG 
DGPEEPALPSL3GVSEKQYTIYIAPGGGAFTTLNGSLTLELVNE 
KFWKVSRPLELCYAPTKDPK 


6371 


3 


288 


GVA1WSTAMNFGTKSFQPRPPDKGSFPLDKLGECKSFKEK?MKC'" 

LHNNNFENALCRKESKEYLECRMERiOjMLQEPLEKLGFGDLTSG 

KSEAXK 


6372 


2141 


625 


RVSAIASEGKAEISR YKKLEDUjEKSFS LVKMPSLQPWMCVMKH 
L PKVPEKKL KLVMAD KELYRACAVEVRRQ I WQDNQALFGDEVS P 
LWCQYILEKESALFSTEIiSVlJlNFFSPSPKTRRQGEVVQRLTRM 
VGKNVKLYDMVLQFLRTLFLRTRNVHYCTLRAELLMSLHDLDVG 
E I CTVDPCHKFTWCLDACI RERPVBS KRARB LQGFLDGVKKGQE 
QVLGDLSMILCDPFAfNTIALSTVRHLQELVGQETLPRDSPDLL 
LLLRLLALGQGAMDKIDSQVFKBPKME VELITRFLPMLMS FLVD 
DYTFNVDQKLPAEEKAPVSYPNTLPESFTKFLQEQRMACEVGLY 
YVLKITKQRNKNALLRLLPGLVETFGDLAFGDIFLHLLTGNLAL 
IJU^B FALED FCS SLFDGFFLTASPRKEJWHRHALRIjLIHLHPR V 
APSKLEALQKALEPTGQSGEAVKELYSQLGEKLEQLDHRKPSPA 
QAAETPALELPLPSVPAPAPL 


6373


67 


711 


PSRAARAS PARLPAMVS W 1 1 S RLWL I FGTL YPAYYS YKAVKS £C! 
DI KEYVKWMMYW 1 1 FAL FTTAETFTD I FLCWFP FYYEUCI AFVA 
WLLSPYTKGSSLLYRKFVHPTLSSKEKEIDDCLVQAKDRSYDAL 
VHFGKRGLNVAATAAVMAAS KGQGALSERLRS FSMQDLTT XRGD 
GAPAPSGPPPPGSGRASGXHGQPKMSRSASESASSSGTA 


6374 




2105 


HKLFCSYISTSEFPSSTRHHSCPTHTFCNYTSSTIFLSSTRDHS 
CPTHT FCNYTS STI FLSSTRDHS C PTHTSCNYTSSTI FLSSTRD 
HS CPTHTSCNYTSS TI FLS S TRDHS CPTHTFCNYPRP I IRLS S C 
CPAELQTEGSNGKKEVLSGFQVVLEDTVLFPEGGGQPDDRGTIN 
DIS VLRVTRRGEQADHFTQTPLDPGSQVLVRVDWERR FDHMQQH 
S GQRL I TAVADHLFKLKTTS WE LGRFR5 Al EIiDTPSMTAEQVAA 
I EQSVNEKI RDRL P VNVR BLSLDDPEVEQVSGRGLPDDHAGP IR 
WN1 BGVDSNMCCGTHVSNLSDDQVI KILGTEKGKKNRTNLI FL 
SGNR VLKWMERS HGrEKALTALLKCGAEDHVEAVKKTiQNSTK I L 
QKNNLNLLRELAVH IAHSLRNS PDWGGW 1 LHRKEGDSEFMN I 1 
ANEIGSE3TLLFLTVGDBKGGGLFLLAGPPASVETLGPRVABVL 
EGKGAGKKGRFQGXATKMSRRMEAQALLQDYISTQSAKS 


6375 


1 


1*35 


AIMAAATRPVRLPEAGCEGRERCWNPSRSRSHSGBGGLAAWSRT 
CPGRPRRPGQQWRGPTMIiVTAYLAFVGLLASCLGIiELSRCRAK 
PPGRACSNPSFLRFQLDFYQVYFLALAADWLQAPYLYKLYQHYY 
FLEGQIAI LYVCGLASTVLFGLVASS LVDWLGRKNS CVLFSLT Y 
SLCCLTKLSQDYFV1LVGRALGGLSTALLFSAFEAWYIHEHVER 
HD FPAEWI PATFARAAFWNHVIAWAGVAAEAVASWIGLGPVAP 
FVAAI PLLALAGAIJ^RNWGEN YDRQRAFSRTCAGGLRCUjS DR 
RVLIXGTXQALFESVIFIFVFLWTPVLDPHGAPLGIIFSSFMAA 
S LI/3S SLYRI ATSKR YHIiQPMHLLS LAVL IWFSLFMLTFSTS P 
GQESPVESFIAFIiLIELACGLYFPSMSFLRRKVIPETEQAGVLN 
WFRVPLHSI^CLGLLVLHDSDRKTGTRNMFSI^ 
VGLFTVVRHDAELRVPSPTEEPYAPEL 


6376 


380 


1437 


ISSTDIDHYRFSFLVNSKMPSKESWS^GRKTNRftAVHKSKQBGRQ 
QDLLIAALGMKLGSPKSSVTIWQPLKLFAYSQLTSLVRRATLKB 
NEQIPKYEKIHNPKVHTFRGPHWCBYCANFMWGLIAQGVKCADC 
GLNVHKQ CS KMVPNDCKPDLKH VKKVYS CDLTTLVKAHTTKR PM 
WDMCI RE IESRGLNSEGLYRVSGFSDLIEDVKMAFDRDGEKAD 
ISVNMYEDINIITGALKLYFRDLPIPLITYDAYPKPIESAKIMD 
PDEQLETLHEALKLLPPAHCETI^YLMAHLKRVTLHHKENLMNA 
ENLGIVFGPTI>1RSPELDAMAALNDIRYQRLVVELLIKNEDILF 


6377 


2311 


1845 


SRIRRRSSRRPREPPGPSRRRRRRRPDPRTMPSEKTFKQRRTFE
QRVEDVRLI REQHPTKIPVI I ERYKGEKQLPVLDKTKFLVPDHV 
NMSELIKIIRRRLQLNANQAFFLLVNGHSMVSVSTPISEVYESE 
KDEDG FL YMVYAS QBTFGMKLSV 


6378 


606 


191 


GAGPWEAFPDGIGRRSRRARLPQYKRPPGRVGGGDSGRRNMAVS^ 
DLAblPDVDIDSDGVFJCYVLIRVHSAPRSGAPAAESKEIVRGYK 
WAS YHAD I YDKVSGDMQKQG CDCE C LGGG R I SHQSQDKK I HV YG 
YS MAYG PAQHAI S TE K I KAKYPD YEVTWANDGY 


6379 


35 


378 


ERAGSPSPSRAALRRCAPQRSQAPRWPDRAACRRSFQGSQGRAV
LFNSWNVGCGPAEERVLLTGLHAVADIYCENGKTTLGWJCYKHA
FESSQKYKEGKYIIEIAHMIKDNGWD


6380


1414 


462 



pavqgqrgagp^rgsgnmarf"aLtv\^getrfnkekii<^~- 

gvdeplsetgfkqaaaagifxjsnvkftoafssdlmrtkqtmhgi 

lerskpckdmtvkydsrlrerkygwegkalselramakaareb 

CPVFTPPGG ETLDQ VKMRGI DPFEFI»CQLI LKEADQ KEQ FSQGS 

psncletslaeifplgknhsskvnsdsgipglaasvlvvshgay 
mrslfbyfltdlkcslpatjcsrselmsvtpntgmslfiinfbeg 

REVKPTVQC I CMNLQDHLNGLTENS LGLNLPS KSNHFEPLKGVP 
LALFTSLLC 


6381 


1668 


218 


AWRAQGSRGFSGAGWRPRQAAAMNFSEVFKLSSLLCKFSPDGK
YTiASCVQYRLVVRDVNTLQILQLYTCLDQIQHIEMSADSLFILC
AMYKRGLVQVWSLEQPEWHCKIDEGSAGLVASCMSPDGRHILNT
TEPHLRITVWSLCTKSVSYIKYPKACLQGITFTRDGRYMALAER
RDCKDYVSIFVCSDWQLLRHFDTDTQDLTGIEWAPNGCVLAVWD
TCLEYKILLYSLDGRLLSTYSAYBWSLGIKSVAWSPSSQFLAVG
SYDGKVRILNHVTWKMITEFGHPAAINDPKIWYKEAEKSPQLG
LGCLSFPPPJIAGAGPLPSSESKYEIASVPVSLQTIJCPVTDRAMP
KIGIGMLAFSPDSYFTATRMDNIPNAVWVWDIQKLRLFAVLBQL
SPVRAFQWDPQQPRLAICTGGSRLYXWSPAGCMSVQVPGEGDFA
VLSLCWHLSGDSMALLSKDHFCLCFLETEAWGTACRQLGGHT


6382


2 


1062 


FEKDBDRWLCLIAYPLKGDHGIVDIVDNSDCEPKSKLLRWTTNK
KHHVLETEKTPKDWVRQHRKEEKMKSHKLEEEFEWLKKSRVLYY
TVEKKGNISSQLKHYNPWSMKCHQQQLQRMKENAKHRNQYKFIL
LENLTSRYEVPCVLDLKMGTRQHGDDASEEKAANQIRKCQQSTS
AVIGVRVCGMQVYQAGSGQLMFMNKYHGRKLSVQGFKEALFQFF
nwv*« z iiKKBULfeP vJjKJUjTELKAVIiERQESYRFYSSSIjLVI YDG 
KERPEWLDSDAEDLEDLSEESADESAGAYAYKPIGASSVDVRM 
IDFAHTTCRLYGEDTWHEGQDAGYI FGLQSLID IVTEISEESG 
E 


6383


3159 


1061



S PAPGRPS PHGSQPAARAAAAPAMPSAKQRGSKGGHGAASPSEK 
GAHPS AAR PLAAPTPAAPACRS PS PGGAPAS FPGRAPRSLAS Q P 
AARAAAAPAMPSAKQRGSKGGHGAASPSEKGAHPSGGADDVAKK 
P P PAPQQ P P P p PAPH PQQH PQQHPQNQAHGKGGHRGGGGGGGKS 
SSSSSASAAAAAAAASSSASCSRPXGRALNFLFYLALVAAAAFS 
3WCVHHVLEEVQQVRRSHQDFSRQREELGQGLQGVBQKVQSLQA 
rFGTFES ILRS SQHKQDLTE KAVKQGBSEVS R ISE VLQKLQNE I 
LKDLSDG I HWKDARERDFTSLENTVEERLTELTKS INDN I AI F 
TEVQKRSQKBINDMKAXVASLEESEGNKQDLKALKEAVKEIQTS 
AKSRE WDMEALRS TLQTMESD I YTEVRELVS LKQEQQAFKEAAD 
TE RLALQALTEKI*LRSEES VSRLPEE I RRLEEELRQLKS DSHG P 
KEDGGFRHSEAFEALQQKSQGLDSRLQHVEDGVLSMQVASARQT 
ESLESLLSKSQEHEQRLAALG^RLEQIjGSSEADQDGIiASTVRSL 
G ETQLVLYGD VE ELKRS VG ELP S TVESLO KVO kdvwtt .t . qnnn a 

QAARLPPQDFLDRLSSLDNLKASVSQVBADIjKMLRTAVDSLVAY 
SVKIETNENNI/ESAKGLLDDLRNDLDRLFVKVEKIHEKV 


6384 


73 8 


1904 


IWEVPVCLTHLLHLQQANQPLPPPSSSINEEDADEANRAIGEKR 
AAPDSGKKPKTPKTKQQKDPNBPQKPVSAYALPPRDTQAAIKGQ 
N PNATFGEVSQI VASMWDSLGEEQKQVYKRKTEAAKKEYLKALA 
AYRASLVS KAAAESAEAQTIRSVQQTIiASTNLTSSLLLNTPLSQ 
HGTVSAS PQTLQCSLPRSIAPKPLTMRLPMNQIVTSVTIAANMP 
SNIGAPLISSMGTTMVGSAPSTQVSPSVQTQQHQMQLQQQQQQQ 
QQQMQQMQQQQLQQHQMHQQIQQQMQQQHFQHHMQQHLQQQQQH 
LQQQINQQQLQQQLQQRLQLQQLQHMQHQSQPSPRQHSPVASQI 
TSPIPAIGSPQPASQQHQSQIQSQTQTQVISQVSIP 


6385 


2 


1584 


PRVRAADVAAGA(lAVVfiAfSMatf QMflRMr2DT>R'D jv ar*c»eT e>r^^ni>V> 
* *^ * w ** *** v^*#iajj^\^\v v onunnJ\oci wOWv^JriCAlrAALzfcSiXjSGTRES 

LAQGPDAATTDELSSLGSDSEANGFAERRJDKPGFIVGSOGAEG 
ALEEVPLEVLRQRESKWLDMLNNV/DKWMAKKHKKIRLRCQKCilP 
PSLRGRAWQYLSGGKVKLQQNPGKFDEIJDMSPGDPKWLDVIERD 
LHRQ7PFHEMFVSRGGHGQQDLFRVLKAYTLYRPEEGYCQAQAP 
IAAVLLtMHM PAEQAFWCLVO ICEKYLPGYV5 P VT ^p-n tot nrcTT 

FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCAFSRTLPWSSVL 
RVWDM FFCEGVKI I FRVGLVLLKHALGS PBKVKACQGQ Y ETIER 
LRSLS PKIMQEAFLVQEWE tiPVTERQIEREHL I QLRRWQETRG 
ELQCRSPPRLHGAKAILDAEPGPRPALQPSPSIRLPLDAPLPGS 
KAKPKPPKQAQKEQRKQMKGRGQLEKPPAPNQAMWAAAGDACP 
PQHVPPKDSAPKDSAPQDIiAPQVSAHHRSQESLTSQESEDTYL 


6386 


819 


195 


TVCGS F YLG IMQRASRLKRELHMIiATEP PPGI TCWQDKDQMDDI* ' 
RAQILGGANTPYBKGVFKLEVIIPERYPFEPPQIRFLTPIYHPN 
I DSAGR I CLD VLKLP PRGAWRPSLNI ATVLTS IQLLMSE PNPDD 
PLMADI SSEFKYNKPAFLKNARQWTEKHARQKQKADEEEMLDNL 
PEAGDSRVHNSTQKRKASQLVGIEKKFHPDV 


6387 


1 


662 


PGPTHAS ADAWADAWAQ PNMAMHNKAAPPQI PDTRRELAELVIO* 
KQELAETLANLERQIYAFEGS VTLEDTQMYGNI IRGWDRYLTNQK 
NSWS KNDRRNR KFKEAERLFSKSS VTS AAAVSALAGVQDQL I EK 
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 
STSSG S HHSSH KKRKNKNRHS PSGMFD YDFE IDLKLNKKPRADY 


6388 


1 


662 


PGPTHAS ADAWADAWAQPNMAMHNKAAPPQIPDTRRELAELVKR " 

KQELAETLANLERQIYAFEGSYLEDTQMYGNIIRGWDRYLTNQK 

NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK 

REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 

STSSGSHHSSHKKRKNramHSPSGMFDYDPEIDI,KLNKKPRADY 


6389 


1074 


497 


AEPGDRMAGHRLVLVLGDLH I PHRCNSLPAXFKKLIjVPGKIQHI ' ' 
LCTGNLCTKES YD YLKTLAGDVH I VRGDFDENLNY PEQKWT VG 
QFKIGLIHGHQVIPWGDMASIiALIiQRQFDVDILISGHTHKFEAF 
EHENKFYI NPGS ATGAYNALETN I IPSFVLMDIQASTWT YVYQ 
LIGDDVKVERIEYKKP 


6390 


158 


535 


GEERKEGRAPGKAFAPERNPAKMEKEETTRELLLPNWQGSGSHG 
LTX AQRDDGVFVQEVTQNS PAARTGWKEGDQ I VG AT I YFDNLQ 
SGEVTQLLNTtiGHHTVGEjKLHRKGDRPFPSLGQTWDP | 


6391 


5386 


28$7 


vr™sktecyi,siqtqenfpanlnelvncivisslvttqrkli<a 

MSIiI/3SRNQLARAVIjNPNPMDFCTKDLLTTTSERI1AY1iRDFNE 
DQKKAI ETAYAMVKHS PS VAKI CIjIHGPPGTGKSKTI VGLLYRL 


6392 






LTENQRKGHSDBNSNAKIKQNRVLVCAPSNAAVDELMKKliLEF 
KEKCKDKKNPLGNCGDINLVRLGPEK5INSEVLKFSLDSQVNHR 
MKKELPSHVQAMH KR KEFLD YQLDELS RQRALCRGGRE IQRQEIj 
DENIS KVSKERQEIiASKIKEVQGRPQKTQS I IILESHIICCTLS 
TSGGLLLESAFRGQGGVPFSCVIVDEAGQSCEIETLTPLIHRCW 

klilvgdpkqlpptvismkaqeygydqsmmarfcrlleenvbhn 

M ISRLP ILQLTVQYRMHPD I CLFPSNYVYNRNLKTNRQTEAIRC 

ssdwpfqpylvfdvgdgsbrrdndsyinvqeiklvmeiiklikd 

KRKDV3FRNIGIITHYKAQKTMIQKDLDKEFDRKGPAEVDTVDA 
FO/3RQKDCVIVTOniANSIQGSIGFLASI,QRLNVTITRAKYSLF 
ILGHLRTLMENQHWNQLIQDAQKRGAIIKTCDKNYRHDAVKILK 
LKP VLQR S LTHPPT IAPEGSRPQGGIjPS S KLDSG PAKTS VAAS L 
YHTPSDS KE ITLTVTS KDPERP P VHDQLQD PRLL KRMG I E VKGG 
IFLWDPQPSSPQHPGATPPTGEPGFPWHQDLSHVQQPAAWAA 
LSSHKPPVRGEPPAASPEASTCQSKCDDPEEELCHRREARAFSE 
GEQEKCGSETHHTRRNSRITOKRTLEQEDSSSKKRKLL 




972


186 


grtgvdlassmahrlqirlltwdvkdtllrlrhplgbaVXtkar" 

AHGIjEVEPSALEQGFRQAYRAQSHSPPNYGLSHGLTSRQWPJLDV 
VLQTFHUAGVQDAQAVAPIAEQLYKDFSHPCTWQVLDGAEDTLR 

ecrtrglrlavisnfdrrlegilgolglrehfdfvltseaagwp 
kpdprifqealrlahmepwaahvgdnylcdyqgpravgmhsfl 

WGPQALDPWRDSVPKEHILPSIiAHLLPALDCLEGSTPGL 


6393
6394 


201> 


730 


TGGS KMAAVATCGS VAASTGSAVATASKSNVTS FQRRGPRAS VT 
NDSGPRLVS I AGTRPS VRNGQLLVS TGLPALiDQLLGGGIAVGTV 
LLIEEDKYNIYSPLLFKYFLAEGIVNGHTLLVASAKEDPANILQ 
ELPAPLLDDKCKKEPDEDVYNHKTPESN1KMKIAWRYQLL D KME 
IGPVSSSRFGHYYDASKRMPOJELIEASNWHGFFLPEKISSTLKV 
EPCSLTPGYTKLLQFIQNIIYEEGFDGSNPQKKQRNILRIGIQN 
LGS PLWGDD I CCAENGGNSHSLTKFLYVLRGLLRTSLSACI ITM 
PTHLIQNKAI IARVTTLSDVWGLESFIGSERETNPLYKDYHGL 
IHIRQiPRLNNLICDESDVfCDLAFKLKRKIiFTIERliHLPPDLSD 
TVSRSSKMDLAESAKRLGPG03MMAGGKKHLDF 




1418 


511 


GAAAGGEGARRRPAAMATVMAA^AAERAVLggfeFRWLLHDEVHA" 
VLKQ LQD I LKEASLRFTLPGSGTEGPAKQENF ILGS CGTDQ VKG 
VLTU2GDALSQAI)VNLKMPRNNQLLHFAFREDKQWKLQQIQDAR 

nhvsqaiylltsrdqsyqfktgaevlklmdavmlqltrarnrlt 

TPATLTLPEIAASGLTRMFAPALPSDLLVNVYINLNKLCLTVYQ 
liHALQ PNSTKNFRPAGGAVIjHS PGAM FEWGSQRLE VSHVHKVEC 
VIPWLNDALVYFTVSrjQLCQQLKDKISVFSSYWSYRPF 


6395 


IF 


658


PSGRPTRPLCCAARRQAARrifiGSVSGWPAGRTPTETSNPGSSVM 
ESVTFEDVAVEFIQEWALLDSARRSLCKYRMLDQCRTLASRGTP 
PCKPS CVSQLGQRAE PKATERGILRATGVAWESQLKPEELPSMQ 

dllbeassrdmqmgpglflrmqlvpsieeretpltrbdrpalob 
ppwslgctglkaamqiqrwi pvptlghrnpwvardsgs 


6396 
6397 


l 


1221 


ANILSSPSKKGQKGTIjIGYSPEGTPliYNFMGDAFQHSSQSIPiCF^" 
IKESLKQILEESDSRQIFYFLCLNLLFTFVELFYGVT.TMQT fi t 

SDGFHMLFDCSALVMGLFAALMSRWKATRIFSYGYGRIEILSGF 
INGLFLI VIAFFVFMES VARM DPPELDTHMLTPVSVGGLI VNL 
IGICAFSHAHSHAHGASQGSCHSSDHSHSHHMHGHSDHGHGHSH 
GS AGGGMNANMRGVFLHVIADTLGS I G VI VST VLIEQ FGWFIAD 
PLCSLFIAIIiIFLS VVPLIKDACQVLLLRLPPEYEKELHIALEX 
IQKIEGLISYRDPHFWRHSASIVAGTIHIQVTSDVLEQRIVQQV 
TG ILKDAGVNNLTI Q VEKEAYFQHMSGLS TGFHDVLAMTKQMES 
MKYCKDGTYIM 




391 


122 


SAGGVGRFE AI RAPARMI E VVCNDRLGKKVRVKCKTDDT I GDLK 
KL IAAQTGTRWNKI VLKKWYT I FKDHVSLGDYE IHDGMNLELYY 
Q 


6398 
6399 


353


1306 


hkqmgplijnkckkillpttvppatmriwllggllppLlLLsglq 

RPTEGSEVAIKIDPDFAPGSFDDQYQGCSKQVMEKLTQGDYFTK 
DIEAQKOTFRMWQKAHlJVWLNQGKVLPQNNmTHAVAlLFYTLN 
SNVHSDFTRAMASVARTPQQYERSFHFKYLHYYLTSAIQLLRKD 
SIMENGTLCYBVHYRTKDVHFNAYTGATIRFGQFIjSTSLLKSEA 
QKFGNQTLFTlFTCLOAPVQYPSLKKEVLIPPYEriFKVINMSYH 

prgdwlqlrstgnlstyncqllkasskkcipdpiaiaslsflts 
viifsksrv 


6400 


75 


1245 


PNbET YFGR RCjE KDSMN FT PTHTP VCRKRT\A/'S KRG VAVS GPTK 

rrgmadslestplpspedrlaklhpskelleyyqkkmaecbaen 

EDLLKKLELYKBACEGQHKLECDLQQREEBIAELQKALSDMQVC 
LFQEREHVLRLYSBNDRIiRrREIiEDKKKlQNiaiALVGTDAGEVT 
YFCKEPPHKVTILQKTIQAVGECEQSESSAFKADPKISKRRPSR 
ERKESSEHYQRDIQTLILQVEALQAQLGEQTKLSREQIEGLIED 
RRIHLEE I QVQHQRNQNKI KELTKNLHHTQELLYESTKDFLQLR 
SENQNKEKSWMLEKDNLMSK I KQYRVQCKKKEDKIGKVLPVMHB 
SHHAQS E Y I KVMS LCRNE WYFSGRVEG I PKNL0FVM 


6401 


2520 


1053


K.TM KCDE WV E VQS A I LRHNCGYAMKTG KFFHNLMER KDPETWL 
DNI SVTFLSLTDLQ KNETLDIIL IS IiSGAVQLRHLSNNLE TLL KR 
DFLKLLPLEI^FYLLKWLDPQTLLTCCLVSKQWNKVTSACTEVW 
QTACKNLGWQlDDSVQDALHWfC^VYLKAILRMKQLEDHEAFETS 
SLIGHSARVYALYYKDGLLCTGSDDLSAKLWDVSTGQCVYGI QT 
HTCAAVKFDEQKLVTGS FDNT VACWE WS SGARTQHFRGHTGAVF 
SVDYNDELDILVSGSADFTVKVWALSAGTCLNTLTGHTEWVTKV 
VLQKCKVKS LLHS PGD YILLSADKYE I KI WP IGRE I N CKCLKTIj 

SVSEDRSICLQPRLHFDGKYIVCSSALGLYQWDFASYDILRVIK 
TPEIANliALLGFGD I FALLFDNRYLy IMDLRXESLISRWPLPEY 
RKS KRGSS FLAGEAS WLNGLEGHNDTGLVFATSMPDHS I HLVLW 
KEHG 


" 6402 


109 


756 


pgaawsrpdlrgcctgpqpalrmlvlpspcpqplafssvetMeg 

PPRRTCRSPEPGPSSSIGSPC^SPPRPNHYLLIDTQGVPYTVI, 
VDEESQRE PGASGAPGQKKCYS CPVCSRVFEYMS YLQRHSITHS 
EVKPFECDICGKAFKRASHLARHHSIHLAGGGRPHGCPLCPRRF 
RDAGELAQHSRVHSGERPFQCPHCPRRFMEQNTLQKHTRWKHP 


6403 


1196


279 


TTSQCGGIRQSSAIPVASMEFAAICLRNALLLLPESQQDPKQEN 
GAKNSNQLGGNTESSBSSETCSSKSHDGDKFIPAPPSSPLRKQE 
LENIjKCSIIACSAYVALALGDNLMALNHADKLLQQPKLSGSLKF 
LGHLYAAEALISLDRISDAITHLNPENVTDVSLGISSNEQDQGS 
DKGENEATffiSSGKRAPQCYPSSVKSARTVWLFNLGSAYCLRSEY 
DKARKCLHQAASMIHPKEVPPEAILLAVYLELQNGNTQLALQI I 
KRNQLLPAVKTHSBVRKKPVFQPVHPIQPIQMPAFTTVQRK 


6404 


2 

ioii j 


1690 

1 

J 

222 i 


KGIHTSVLQGNLQNQMYSHNVVIMNLNNLNLTQVQQRNLITNLQ 
RS VDDTSQAIQRIKNDFONLQQVFLQAKKDTDWLKEKVQS LQTL 
AANNSALAKANNDTLEBMNSQLNS FTGQMENITTISQANEQNLK 
DLQDLHKDAENRTA I KFNQLEERFQLFETD I VNI ISNT <5 YT* vu 
LRTLTSNLNEVRTTCTDTLTKHTDDLTSLNNTIiANIRLDSVSIjR 
MQQDLMRSRLDTEVANLSVIMEEMKLVDSKHGQLIKNFTILQGP 
PGPRGPRGDRGSQGPPGPTGNKGQKGEKGEPGPPGPAGBRGPIG 
PAG PPGERGGKGSKGSQGPKGSRGS PGKPGPQGPSGDPGP PGPP 
GKEGLPGPQGPPGFQGLQGTVGEPGVPGPRGLPGLPGVPGMPGP 
KGPPGPPGPSGAWPLALQNEPTPAPEDNSCPPHWXNFTDKCYY 
FSVEKEIFEDAKLFCEDKS SHLVFINTREEQQWI KKQMVGRESH 
^IGLTDSERBNEWKWLDGTSPDYKNVJKAGQPDNWGHGHGPGEDC 
^G1»I YAGQWNDFQCEDVNNFr CEKDRETVLSS AL 
^AAIiAMAAPAPGI>I S VFS S SQEI&AAIjAQ LVAQRAACCLAGARA~ 






WO 01/53312 



PCT/US00/34263 



R FALGLS GGS LVSMLARELPAAVAPAGP AS LARWTIiGPCDERLV
P PDHAES T YGLYRTHLLSRLP I PESQV I T I NPELPVB EAAED YA 
KKLRQAFQGDS I PVFDLLILGVGPDGHTCSLFPDHPLLQEREKI 
VAP IS DS PKP PPQRVTLTLPVLNAARTV I F VATGEGKAAVL KR I 
LEDQEBNPLPAALVQPHTGKLCWFLDRAAARLLTVPPBKHSPL | 


6405 


1 


1456 


AALPRPTPRAPLGREGTGSDSEriAASMFYGRLVAVATLRNHRPR - "" 
TAQRAAAQVLGSSGLFITOHGLQVQQQQQRNLSLHEYMSMELLQE 
AGVSVPKGYVAKSPDEAYAIAKKLGSKDWIKAQVLAGGRGKGT 
FESGLKGGVKIVFSPEEAKAVSSQMIGKKLPTKQTGEKGRICNQ 
VLVCERKYPRREYY PAITME RS FQGPVLI GSSHGGVN IED VAAE 
TPEAIIKEPIDIEBGIKKEQALQLAQKMGPPPNIVESAAENMVK 
LYSLFLKYDATMIBINPMVEDSDGAVLCMDAKINPDSNSAYRQK 
K I FDLQDWTQEDE RD KDAAKANLMY IGLDGNIGCLVNGAGLAMA 
TMDI I KLHGGTPANFLDVGGGATVHQVTEAFKLITSDKKVLAI L 
VNI FGGI MRCDVIAQGI VMAVKDLEI KI P VWRLCGTRVDDAKA 
L I ADSGLKI LACDDLDEAARMW KLSE I VTLAKQAH VD VKFQLP 


6406 


1036 


167 


HPRQMRGED'i'PEAPPYSSGRYDSIKTEVSGCPSOLTVGRAPTAD 
DDDDDHDDHEDNDKMNDS EGMD PERLKAFNMFVRLFVDENIiDRM 
VPISKQPKEKIQAI IESCSRQFPEFQERARKRIRTYLKSCRRMK 
KNGMEMTR PTP PHLTSAMAENI LAAACES ETRKAAKRMRLE I YQ 
SSQDEPIALDKQHSRDSAAITHSTYSLPASSYSQDPVYANGGLN 
YSYRGYGALSSNLQPPASLQTGNHSNGBSGEARALASRPAPSWV 
CRAALGSGMGRGKQRPVMERGCLTA 


6407 


492 


150 


VGLCIAVSQTVIiAQLDALLVFPGQVAQLSCTLSPQHVTIRDYGV 
SWYQQRAGSAPRYLLYYRSEEDHHRPADIPDRFSAAKDEAHNAC 
VLTI S PVQPEDDADYYC3 VG YGFS P 


6408 


1458 


903 


RGCI TSSQAWRLFGG VTRGFNMRI EKCYFCSGPIYPGHGMMPVR 
NDCKVFRFCKS KCHKN FKKKRNPR KVRWTKAFRKAAGKELT VDN 
S FE FE KRRNEP I KYQR ELWNXT I DAMKR VBB IKQKRQAKF IMNR 

LKKNKELQFCVQDIKEVKQNIHLIRAPLAGKGKQLEEKMVQQLQE 
DVDMEDAP 


6409 


150


446 


NTALANULRCFTCDRLCGGCTAPAPPAHQGIVLQPVMPSCDPGP 
GPACLPTKTFRSYIiPRCHRT YSCVHCRAHLAKHDELISKS FQGS 
HGRAYLFNSV 


6410 


65 


607 


RGGTAGCVACliGCWGQS S SPKAAF PAGSACLPADSCPCLLFQAC " 
AISGLFMCITIHPLNIAAGVWMIMNAFILLLCBAPFCCQFIEFA 
NTVAEKVDRLRS WQKAVP YCGMAWPI VI SLTLTTLLGNAI AFA 
TGVLYGLSALGKKGDAISYARIQQQRQQADEEKLAETLEGEL 


6411


302 


772 


PXSIMASSI^EDPEGSRITYVKGDLFACPKTDSLAHCISEDCRM 
GAG I AVLFKKKFGGVQELliWQQKKSGE VAVLKRDGR YI YYLI T K 
KRASHKPTYEI^QKSLEAMKSHCXJQTGVTDLSMPRIGCGLDRLQ 
WENVS AMI EEVFEATDI KI T VYTL 


6412 


61 


1709 

... 


RPVTSPSPLPGSCGGRLGTRTMLGRSLREVSAALKQGQITPTEL 
CQKCLSLIKiCTKFIJ^AYITVSEEVALKQAEESEKRYKNGQSLGD 
LDGI P I AVKDNFS TSG IETTCASNML KGYI PPYNATWQKLLDQ 
GALLMGKTWLDEFAMGSGSTDGVFGPVKNPWSY5KQYREKRKQN 
PHS ENEDS DWL I TGGSSGGSAAAVS AFTC YAALGSDTGGSTRNP 
AAHCGLVGFKPSYGLVSRHGIilPLVNSMDVPGILTRCVDDAAlV 
LGAIAGPDPRDSTTVHEPINKPFMLPSLADVSFCLCIGIPKEYLV 
PELSS E VQSLWS KAADLFE S EGAKVI E VSLPHTS YS I VC YHVLC 
TSE VASNMARFDGLQ YGHRCDI D VSTEAMYAATRREGFND WRG 
RILSGNFFLLKENYENYFVKAQKVRRLIANDFVNAFNSGVDVLL 
TPTTLSEAVPYLE F I KEDNRTRSAQDDI FTQ AVNMAGtiPAVS I P 
VAIiSNQGLPIGLQFIGRAFCDQQLLTVAKWFEKQVOFPVIQLOE 
LMDDCSAVLENEKLASVSLKQ 
6413


2 


685


HEPRCAGMAASLWMGDLEPYMDENFI SRAFATMGET VMS VKI IR 
NRLTG I PAGYCFVEFADLATAE KCLHKINGKPLPGATPAKRFKL 
NYATYGKQPDNSPEYSLFVGDLTPDVDDGMLYEFFVKVYPSCRG 
G KWLDQTGVS KGYGFVKFTDELEQKRALTECQGAVGLGS KPVR 
LSVAI P KAS RVKPVE YSQM YS YS YNQY YQQYQNY YAQWG YDQNT 
GSYSYSYPQYGYTQSTMQTYEEVGDDALEDPMPQLDVTEANKEF 
MEQSEELYDALMDCHWQPLDTVSSEIPAMM 


6414 


1 


538 


RGGRAALLPWRRFPCCRPRPQPARPSSRATPGPRSPGMATSIGV" 
SFSVGDGVPEAEKNAGEPENTYILRPVFQQRFRPSWKDCIHAV 
LKE ELANAE YS PERM POLTKHLS KM T Knif r . WMn t>m> mn nrrwr 

VIGEQRGEGVFMASRCFWD^U)TDNYTHDWMNDStiFCVVAAFGC 
FYY 


6415


2 


1168 


FVRQMQS SHRRACGLGCEARAGGGEEPRGRASS VAGWVGAFRAP 
FIEAAVAGLGAGSGKRRRGWKMPVHSRGDKKETNHHDEMEVDYA 
ENEGSSSEDEDTRSSSVSEDGDSSEMDDEDCERRRMECLDEMSN 
LE KQ FTD L KDQLYKERLSQ VDAKLQEVI AGKAP EYLEPLATLQE 
NMQ I RTKVAG I YRELCLESVKNKYECEI QASRQHCES E KLLL Y D 
TVQSELEEKIRRLEEDRHSID1TSELWNDELQSRKKRKDPFWPD 

i^^xjvcw viavarjri a V Xl w LU^LfLiU 1 LiZtUrfl 1 J.J\JvAMAl liGFHRVKTEP 

P VKLEKHLHS ARSEEGRliYYDGEW Y I RGQTI C I DKKDECPTS AV 
ITTINHDEVWFKRPDGSKSKLYISQLQKGKYSIKHS 


6416 


410 


1519


EIAPADIiEIPACAPVLLSRATSSTMSVTGGltMAPSLTQEILSHL 
GLAS KTAAWGTLGTLRT FLNFS VDXDAQRLLRAITGQGVDRSA I 
VDVLTNRSREQRQLl SRNFQERTQQDLKKSLQAALSGNLERI VM 
ALLQPTAQFDAQELRTALKASDSAVDVAIE I LATRTP PQLQECL 
AVYKHNFOVE AVDG I TS ETQR TTiOnT.T.T . At. n v««t> n c v cr»r t n v 

NLAEQDVQALQRAEGPSREETWVPVFTQRNPEHLIRVFDQYQRS 
TGQELEEAVQNRFHGDAQVALLGLASVIXNTPLYFADKLHQALQ 
ETEPNYQVLIRILISRCETDLLSIRAEFRKKFGKSLYSSLQDAV 
KGDCOSALIiALCRAEDW 


6417 


1 


845 


RGESRVLWSELEGEAGGAGGWASSLWAfeMDNRFATAPVTAfn/T q 
LI STI YMAAS IGTDF WYE YRSP VQENSSDLNKS I WDEFI SDEAD 
EKTYHD AL PRYNGTVGLWRRCIT I PKNMHW YS P PERTES FDWT 
KCVSFTLTEQFMEKFVDPGNHNS GIDLLRTYLWRCQFLLPFVSL | 
GLMCFGAilGLCACICRSLYPTIATGILHLLAGLCTLGSVSCYV J 

AG I ELLHQXLELPDNVSGEFGWS fclacvsaplqfmas alf i wa 

AHTNRKEYTLMKAYRVA 


6418 


2 


662 


TRTRPRRPPGLGAAVGKAGARSTSTPAGASPAAAYQADPPPPAH 
TPAPPPPPP CGGI ACHGEPAKFYGYDNLQRQP I FTTQQEAELVQ 
YPDCKSSSGNIGEDPDHLNQSSSPSQMFPWMRPQAAPGRRRGRQ 
TYS RFQTIjEIiE KEFLFNP YLTRKRRI E VSHAUUiTERQ VKI WFQ 
NRRMKWKKENNKDKFPVSRQEVKDGETKKEAOEIjEEDRAEGLTN 


6419 


1 


973 


PGRPRVRNFDLNSKSILQEFFCTRSIQIPANRSKTAMSKCPIFP 
MARS ISTSGPLDKEDTGRQKLISTGSLPATLQGATDSLGLE whl 
PSPDPVTVPYLSPLWWKELESLLENEGDHAITVADFVDHHPIV 
FV^LVWYFRRLDLPSNLPGLrLSSBHCNKYSKIPRHCMSEDSKY 
VLI QMLWDNMKLHQDPGQPLY ILWNAHTQKYPMVHLLQKSDNS F 
NQE LL KSMVK3 1 KMNDVYG PMSQILETLMKCPHFKRQRS L YREI 
LFLSLVALGRENIDIDAFDKEYKMAYDRLTPSQVKSTHNCDRPP 
STGVMECRKTFGEPYL 


6420 


207 


1187 


RKM I DKNQTCGVGQDS VP YM I CL IK I LEE WFG VEQt>ED YLNFAN 
YLLWVFTPLILLILPYFTIFLLYLTIIFLHIYKRKNVLBCEAYSH 
NLWDGARKTVATLWDGHAAVWHGYEVHGMEKIPEDGPALI IFYH 
GAI P IDF YYFMAKI FIHKGRTCRWADHFV FKI PG FSL LLDVFC 
ALHG PRE KCVEI L RSGHLLA I S PGGVREAL ISDET YNI VWGHRR 
GFAQVAIDAKVP 1 1 PMFTQNIREGFRSLGGTRLFRWLYEKFRYP 
FAP MY GG F P VKLRT YLGD P I P YD PQ ITABELAE KTKNAv'6aL ^ D
KHQRIPGNIMSALLERFH 


6421 


1844


362 


WALSIJU«QPERMSKfkL£^PHPHSVVLRSEFK7yrASSPavT pacpr — 
YQWSLKSSAQFLGS PQLRQVGQI IRVPARMAATLILEPAGRCCW 
DE PVR I AVRGLAP EQ P VTL RAS LRDE KG ALFQ AHAR YRADTLG E 
LDLERAPALGGSFAGLEPMGLLWALEPEKPLVRLVKRDVRTPLA 
VELEVLDGHDPDPGRLLCQTRHERYFIiPPGVRREPVRVGRVRGT 
LFLPPEPGPFPGIVDMFGTGGGLLEYRASLLAGKGFAVMALAYY 
NYEDLPKTME^TLHLEYFEEAMNYLLSHPEVKGPGVGLLGISKGG 
ELCLSMAS FLKG X TAAWING S VANVGGTLR YKGETL P PVG VNR 
KRIKVTKDGYADIVDVLNSPLEGPDQKSFIPVERAESTFIjFLVG 
QDDHNW KS E F YANEACKRLQAHGRR KPQI I C YPETGH YI EPP YF 
PLCRAS LHAIjVGS P 1 1 WGGE PRAHAMAQVDAWKQ LQT FFHKHLG 
GREGTTPSKV 


6422 


181


2133 


egenlswfqefwgdiakefywktpcpgpflrynfdvTRgkifie - 

WMKGATTNI CYNVLDRNVHEKXLGDiCVAFYWEGNEPGETTQITY 
HQLLVQ VCQPSNVLRKQGI HKGDR VAt YMPMI P EL WAM LACAR 
iwnunoi v fflur c>£>.E»&lA.t£K JLJoL/S S CSI>L 1 1 i D AF YRG E KLVNL 
KEIJUDEMQKC^BKGFPVRCCIVVKHLGRAELGMGDSTSQSPPI 
KRS C PDVQ I S WNQG IDLWWHELMQE AGDBCE PEWCDAEDPLFI Ii 
YTSGSTGKPKGWHTVGGYMLYVArTFKYVFDFHAEDVFWCrAD 
I GWITGHS YVTYGPLAWGATSVLFEGI PTYPDVNRLWS I VDKYK 
VTKFYTAPTAIRLLMKFGnPPVTTfHQPaCTi^TTT r-nrnrnnmni. 

WLWYHRWGAQRCPIVDTFWQTETGGHMLTPLPGATPMKPGSAT 
FPFFGVAPAIIiNESGEELEGEAEGYLVFKQPWPGIMRTVYGNHE 
RFETTYFKKFPGYYVTGDGCQRDQDGYYWITGRIDDMLNVSGHL 
LSTAEVESALVEIIEAVAEAAWGHPHPVKGECLYCFVTLCDGHT 
P3PKLTEELKKQIREKIGPIATPDYI0NAPGLPKTRSGKIMRRV 
LRKIAQNDHDLGDMSTVADPSVISHIiPSHRCLTIQ 


6423 


614 


1237 


AT^KEIPRDLPPETVLLYLDSNQITSlPNEIFXDLHQlJiVI^NLS 
KNG I E F I DEHAFKG VAETLQTLDLS DNRIQS VHKNA FNNLKARA 
RlANNPWHCDCIljQOVIJlSMASiniETAMWTrVTQUT nrunroD 
FI^AANDADLO^PKKTTDYAMLVTMFGWPTMVISYVVYYVRQN 
QEDARRHtiEYLKSLPSRQKKADEPDDISTW 


6424 


1 


1188 


KKVSWPVAAMVHCSCVLFRKYGNFIDKLRLFTRGGSGGMGYPRL 
GGEGGKGGDVWVVAHNRMTLKQLKDR YPRKRFVAG VGANSKI SA 
LKGSKGKDWEI PVPVGIS VTDENGKI IGELNKENDRILVAQGGL 
GGKLLTWFLPLKGQKRI I HLDLKL IADVGLVGF PNAG KSS LL S C 
VSHAZPAIADYAFTTLKPELGKIMYSDFKQISVADLPGLIEGAH 
MNKGMGH KFLKH IERTRQ JLLFWDI S GFQLS SHTQYRTAFE TI I 
LLTKELELYKBELQTKPALLAVNKMDLPDAQDKFHBLMSQLQNP 
KDFLHIiFEKNM I PERT VE FQHI IPI S AVTGBG I EEfcKNC I R KS L 
DEQANQENDALHKKQLLNLWI S DTMSSTEPPSKHAVTTSKMD 1 1 


6425 


1850 


1144 


LAMEGGGGI ^LilTX'IjKEESQSRHVLPASFEVNSLQKSNWdFLL'tG"*" 
L VG GT LV A VYAVAT P FVTPAL RKVCL P FVPATM KQ I ENWKMLR 

fTJDfSGT VnTr'OPTVtl TUTU n n ti-tj-i-i run » 

uKKUaXi vu i L»faGDGRl V I AAAKKGPTAVG YELNP WLVWYSR YRA 
WREGVHGSAKFYI SDLWKVTFSQYSNWI FGVPQMMLQLBKKLE 
RELEDDAR VIACR FPFPHWTPDHVTGEGIDTVWA YDASTFRGRE 
KRPCTSMHPQLPIQA 


6426 


30 


565 


SRGAAVGGMSVAGGEIRGDTGGEDTAAPGRFSFSPEPTLEDIRR 
LHAE FAAERDWEQ FHQPRNLLLALVGE VGELABLFQ WKTDGEPG 
PQGWS PRERAALQ EELSDVL I YL VALAARCR VDLPLAVLSKMDI 
JRRRYPAHLARSSSRKYTELPHGAISEDQAVGPADl PCDSTGQT 


6427 


145


959


AASWGPPHVPKAGKMVSWMICRLWLVFGMLCPAYASYKAVKTK 
yiR£YWV7MMYWiyFALFP1AAEIVTDlFISWFPFYYEIKl^FVl, 








WLLSPYTKGASLLYRKFVHPSLSRHEKBIDAyiVQAKERSYETV 

LSFGKRGLNIAASAAVQAATKSQGALAGRLRSFSMQDLRSISDA 

PAPAYHDPLYLEDQVSKRRPPIGYRAGGLQDSDTEDBCWSDTEA 

VPRAPARPREKPLIRSQSLRWKRKPPVREGTSRSLKVRTRKKT 
VPSDVDS 


6428 


1982 


444 


SGSGGKMEDHQHVPIDIQTSKLLDWLVDRRHCSLKWQSLVLTIR~ 
EKINAAIQDMP ESEE I AQLLSGS YI HYFHCLR I LDLLKGTEAST 
KNI FGRYSSQRMKDWQE 2 1 ALYEKDNT YLVELSS LLVRNVNYE I 
PSLKKQIAKCQQLQQBYSRKEEECQAGAAEMREQFYHSCKQYGI 
TGENVRGE LLALVKDLPSQLAE I G AAAQQSLGEAI DVYQAS VGF 
VCBS PTEQVLPMLRPVQKRGNSTVYEWRTGTE PS WERPHLEEL 
PEQVAEDAI D WGO FG VEAVS EGTDSG I SAEAAG I DWG I FPESDS 
KDPGGDGIDWGDDAVALQITVIjEAGTQAPEGVARGPDALTtiLEY 
TETRNQFLDELMELElFLAQRAVELSEEADVLSVSQFQLAPAll, 
QGQTKE KMVTMVS VLEDL I G KLTSLQLQHLFMl LAS PRYVDRVT 
BFLQQKLKQSQIiLALKKBLMVQKQQEALEEQAAIiEPICLDLLLEK 
TKELQKLI EADI S KRYSGRPVNLMGTSL 


6429


3413 


3442 


KeSSVrTAAPRGPLAAHPLEAAVQEDDRRALSFDSRIKVPANGTL 

WKSVTDKDAGDYLCVARNKVGDDYVVLKVDVVMKPAKIBHKEE 

NDHKVFYGGDLKVDCVATGLPNPE I S WS LPDGSLVNS FMQSDDS 

GGRTKR YWFNNGTLY FNEVGMREEGDYTCFAENQVGKDEMRVR 

VKWTAPATI RNKTCLAVQVP YGDVVTVACE AKG FPM D tottwt .q 

PTNKVIPTSSEKYOIYQDGTLLIQKAQRSDSGNYTCIiVRNSAGE " 

DRKTVWIHVNVQPPKINGNPNPITTVREIAAGGSRKLIDCKAEG 

IPTPRVLWAFPEGWLPAPYYGNRITVHGNGSLDIRSLRKSDSV 

QLVCMARNEGGEARLIVQLTVLEPMEKPIFHDPISEKITAMAGH 

TISLNCSAAGTPTPSLVWVLPNGTDLQSGQQLQRPYHKADGMLH 

rSGLSSVDAGAYRCVARNAAGHTERLVSLKVGLKPEANKQYHNL 

VSIINGETtiKLPCTPPGAGQGRFSWTLPNGMHLEGPQTLGRVSlj 

LDNGTLTVREASVFDRGTYVCRMETEYGPSVTS1PVIVIAYPPR 

rTSEPTPVTYTRPGNTVKLNCMAMGlPKADITWELPDKSHLKAG 

VQARL YGNRFLHP QGSLT I QHATQRDAGFYKCMAKN IliGSDS KT 

TYIHVF 


6430 


1946 


602 


RTRVSTGLRRTLLWSEAVGASSTRGDTGIPGSGEGGAGPGGGEG""' 

AMLEAMAE PSPEDP? PTLKPETQPPEKRRRTI EDFNKFCS FVLA 

YAGYIPPSKEESDWPASGSSSPIiRGESAADSDGWDSAPSDLRTI 

OTFVKKAKSSKRRAAQAGPTQPGPPRSTFSRLQAPDSATLLEKM 

KLKDSLFDLDGPKVASPLSPTSLTHTSRPPAALTPVPLSQGDLS 

HPPRKKDRKNRKLGPGAGAGFGVLRRPRPTPGDGEKRSRIKKSK 

KRKLKKAERGDRLPPPGPPQAPPSDTDSEEEEfiEEEEEEEEEMA 

TWGGBAPVPVLPTPPEAPRPPATVHPEGVPPADSESKEVGSTE 

TSQDGDAS SSEGEMRVMDEDI MVESGDDSWDLITCYCRKPFAGR 

PMI ECS LCGTWIHLfl CAKIKKTNVPDFFYCQKCKELR ^EARRLG 

GPPKSGEP 


6431 


3 


605 


WWNSSYKIiPAYAPYI»PCEACAMQDGRKGGAYAGKMEATTAGVGT~ 

LEEEALRRKERLKALREKTGRKDKEDGEPKTKHLREEEEEGEKH 

RELRLRNYVPEDEDLKKRRVPQAKPVAVEEKVKEQLEAAKPBPV 

IEEVDLANLAPRKPDWDLKRDVAKKLEKLKKRTQRAIAELIRER 

LKGQEDSLASAVDAATEQKTCDSD 


6432 


56 


1632 


GGLGTMGSRIKQNPETTFEVYVEVAYPRTGGTLSDPEVQRQFPE 
DYSDQEVLQTLTKFCFPFYVDSLTVSQVGQNFTFVLTDIDSKQR 
FGFCRLSSGAXSCFCILSYLPWFEVFYKLLNILADYTTKRQENQ 
WNELLETLHKLP I PDPGVSVHLS VHS YFTVPDTREIiPS I PENRN 
LTE YPVAVDVNNMLHL YASMLiYBRR IL 1 1 CSKLSTLTACIHGSA 
AMLYPMYWQHVYIPVLPPHLLDYCCAPMPYLIGIHIiSIjMEKVRN 
MALDDWILNVDTNTLETPFDDLQSLPNDVISSLKNRLKKVSTT 



6433 



"1*24 



TGDGVARAFiiKAQAAFFGS YRNALKI EPEKPITFCBBAFV3HYR 
SGAMRQFLQNATQLQLFKQFIDGRLDLLNSGEGFSDVFBEE1NM 
GBYAGS DKLYHQ WL^TVRKGSGAI LNTVKTKANPAMKTVYKFDI 
ABNGCAPTPEEQLPKTAPSPLVEAKDPKLRBDRRPITVHFGQVR 
PPRPHWKRPKSNIAVBGRRTSVPSPBQNTIATPATLHILQKSI 
TEFAAKFPTRGWTSSSH 

APVTKRKEWAKDSKOSAIiDA GRDPKRPALPETLCESGWASjrrA 
PTTPPQPGWCLCGKDFKSSCQTPGRBKERRLATMHGSCSFLMLL 
LPLLLLLVATTGP VGALTDBEK1U*MVE lilNLYRAQVS PTASDML 
HKRWDBELAAFAKAYARQCVWGHNFCBRGRRGENLFAITDEGMDV 
PI^EEWHHBREMYNLSAATCSPGQMOGHy'IQVVWAKTERIGCG 
SHFCBKLQGVEETWIELLVCNYEPPGNVKGKRPYQEGTPCSOCP 
SGYHCKMSLCE P I GSPEDAQDLP YLVTEAPS FRATEAS DSRKMG 
AEGPDKPSWSGLNSGPGHVWGPLLGLIjLLPPIiVLAGIF 



2002 



6435 



"^22T 



MPQLNFGMADPTQMGGLSMLIiLAGEHAIiGTPBVFSGTCRPDVSE" 
SPELRQKSPLFOFAEISSSTSHSDASTKQCQTSALFQEAEISSN 
; TSQLGGAEPVKRCGKSALFQIJ^MCLASEGMKMEESKLIKAKES 

dggrtkelekgkeekeikme:<tdetri^keaefeksakbnlrds 

KELRNFEALQI DD I MAI KM EDPKE IRKB ELEEDHKCSHFPDFS Y 

sasskiiisdvpsrkdhmchphgimiibdpaalnkpeklkkkxk 

KSKMDRHGNDKS TPKKTCKKRQSSESD I ESVIYTIEAVAKGDWG 
IEKLGDTPRKKVRTSSSGKGSILI>AKPPKKKVKSREKKMSKEKS 
SDTTKESRPPDFr S ISASKNI SGETPEGIKAEPLTPMEDALPPS 
LSGQAKPEDSDCHRKI ETCGSRKSERS CKGALYKTLVSEGMLTS 
LRANVDRGKRSSGKGNSSDHEGCWNBESWTFSQSGTSGSKKFKK 
TKPKED CLLGSAICLDEE FEKKFNS LPQ YS PVTFDRXCVPVPRKK 
KXTGNVSSEPTKTSKGSGDKWSNKQLFLDAIHPTEAIFSEDRNT 
MEPVHKVKNIPSI FNTPEPTTTARTFGGQPKEKSKENPDYSPCQ 
DTQRAGYHHEEVLWMTNIiMNNCGGVYLKQLRHTAMTWA 



"643T 



1295 



14T 



6437 



182B 



ALQRDAAAAYAHPE YEERFLQEET VSQQ INS IELLQTR PLAL PE 
I VVKSQRPLQRQVHLRGRPASQPTVTRGITYYKAKVSBEENDIEE 
QQDBFFSGDNGVDLLIEEXJLLRHNGLMTSVTRRPAATRQGHSTA 
VTSDLNARTAPWSSALPQPSTSDPSIANHASVGPTLQTTSVSPD 
PTR ESVLQPSPQ VPATTVAHTATQQPAAPAP PAVS PREALMBAM 
HTVPVPPTTVRTDSLGKDAPAGRGTTPASPTLSPBBEDDIRNVI 
GRCKDTLST I TGPTTQNTYGRNBGAWMKDPLAKDER IYVTNYYY 
j GNTLVE FRNLENFKQ3R WSNS YKLP YS H IGTGHWYNGAF YYNR 
| AFTRNIIKYDLKQRYVAAWAMLHDVAYEEATPWRWQGHSDVDFA 
VDBNGLWLIYPALDDEGFSQBVIVLSKLNAADJiSTQKETTWRTG 
LRRNFYGNCFVICGVLYAVDSYNQRNANISYAFDTHTNTQIVPR 
LLFENBYFYTTQIDYNPKDRLLYAWDNGHQVTYHV IFAY 

gacrpp vrqdpdsgpdyealpAgAtvt thmvagavagi lehcvm" 

YP IDCVKTRMQSLQPDPAAR YRNVLEALWR 1 1 RTEGLWR PMRGL 
NVTATGAGPAHALYFACYEKLKKTLSDVI HPGGNSHI ANGAAG C 

I VATLLHDAAMNPAEWKQRMQMYNSPYHRVTDCVRAVWQNEGAG 
AFYRSYTTQLTMNVPFQAIHFMTYEFLQBHFNPQRRYNPSSHVL 
SGACAGAVAAAATTPLDVCKTIiLNTQKSIi ALNSHITGH ITGMAS 

! AFRTVYQVGGVTAY FRGVQARVI YQIPSTAIAWSVYEFFKYLIT 
KRQEEWRAGK 

PPAPAPPASPARHVTRTARGHLBGGSRAPPLiiQAVFLQIKNMVK " 



LI HTLADHGDDVNCCAFS FSLLATCS LDXTIRL YS LRDFTELPH 
SPLKFHTYAVHCCCFS PSGHI LAS CS TDGTTVLWNTENGQMIiAV 
1 MBQPSGS PVRVCQFSPDSTCLASGAADGTWLWNAQSYKLYRCG 
S VKDGSLAACAFSPNGS FF VTG SS CGDLTVKDD KMRCLKSEKAH 
DLG I TCCDFS S QPVSDGEQGLQFFRLASCGQDCQ VKI WI VSFTH 
ILGFELKYKSTLSGHCAPVLACAFSHDGQMLVSG3VDKSVIVYD 








TNTSNILHTLTQHTRYVTTCAFAPNTIilATGSMDKTVNIWQFD 
LBTLCQ ARSTEHQLKQFTEDWS E ED VSTWLCAQDLKDIiVG I FKM 
NNIDGKEDLNLTKESLADDLKIBSLGLRSKVLRKIEELRTKVKS 
LSSGIPDEPICPI TRELM KDPV1 AS DGYS YEKE AMENWDPAKRN 
RTSPP 


6438 


" " 109 


901 


E VQ I LRAKMFQTGGLI V P YGLLAQTMAQ FGGLP VPLDQTLPLNV 
NPALP LS PTGLAGSLTNALSNGLLSGGLLGI LENLPL LD I LKPG 
GGTSGGLLGGLLGKVTS VI PGLNNI IDIKVTDPQLLELGLVQSP 
DGHRLYVTI PLGI KLQVNTPLVGASLLRLAVKLDITAEILAVRD 
KQERIHLVLGDCTHSPGSLQISLLDGLGPLPIQGLLDSLTGILN 
KVL PE LVQGNVCPLVNE VLRGLD 1 TL VHDI VNML I HGLQF VI KV 


6439 


23 


412 


SIQTASAITTEMASQSQGICK3LLQAEKRAAEKVADARKRKARRL 
KQAKEEAQMBVEQYRREREHBFQSKQQAAMGSQGNIiSAEVEQAT 
RRQVQGMQSSQQRNRERVLAQLLGMVCDVRPQVHPNYRISA 


6440 


3 


517 


R^MNSDMGDLPGLVRLSIALRIQPNDGPVPYKVDGQkPdONRT 
I KLLTGS S YKVEVKI KPSTLQVENIS IGGVLVPLELKSKEPDGD 
RWYTGTYDTEGVTPTKSGERQP I Q I TMPFTDIGTFETVWQVKF 
YNYHKRDHCQWGSPFSVIEYECKPNETRSLMWVNKESFL 


6441 


234 


1373 


KSGGLRRRQRPGRSAAVGEEELPPGMEkFKAAMLLGSVGDALGY 
RNVCKENTSTVGMKIQEELQRSGGLDHLVLSPGEWPVSDNT1MHI 
ATABALTTDYWCLDDLYREMVRCYVEIVEKLPERRPDPATIEGC 
AQLKPNNYLLAWHTPFNEKGSGFGAATKAMCIGLRYWKPERLET 
LIEVSVECGRMTHNHPTGFLGSLCTALFVSFAAQGKPLVQWGRD 
MLRAVPrAEEYCRKTIRHTAEyQEHWFYFEAKWQFYLEEJRKISX 
DSENKAIFPDNYDAEERBKTYRKWSSEGRGGRRGHDAPMIAYDA 
LLAAGN S WTELCHRAMFKGGESAATGTI AGCLFGLL YGLDLVP K 
GXiYQDLEDKEKIiEDIiGAALYRLSTEEK 


6442 


34 


796 


AEDPAGGLAGQDTMFARGLKRKCVGH E ED VEGALAGLKTVS SYS 
LQRQSLLDMS LVKLQLCHMI.VEPNLCRS VLI ANTVRQ IQEEMTQ 
DGTWRTVAPQAAERAPLDRLVSTEI LCRAAWGQEGAHPAS GLGD 
GHTQGP VSDLCP VTSAQAPRHLQSSAWEMDG PRENRGS FH KSLD 
QIFETLETKNPSCMEELFSDVDSPYYDLDTVLTGMMGGARPGPC 
EGLEGLAPATPGPSS S CKSDLGELDKWE ILVET 


6443 


2 


555 


maspaassvrpprpkxepqtlvipknaaeeqklklerlmknpdk" 

AVPIPEKMSEWAPRPPPEFVRDVMGSSAGAGSGBFHVYRHLRRR 
EYQRQDYMDAMAEKQKLDAEFQKRLEKNKIAAEBQTAKRRKKRQ 
KLKEKKLIiAKKMKLEQKKQEGPGQPKEQGSSSSAEASGTEEEEE 
VPSFTMGR 


6444 


390 


899 


GSTPRGKMRAPIPEPKPGDLIEIFRPFYRllWAIYVGDGYVVHIiA 
P PS EVAGAGAAS VMS ALTDKAI VKKELLYDVAGSDKYQVNN KHD 
DKYSPLP CSKI I QRAEELVGQE VL YKLTS ENCEHFVNELRYGVA 
RSDQVRDVI IAAS VAGMGLAAMSLIGVMFSRNKRQKQ 


6445 


2 


753 


agaagaagaarsprpqahtkgvrglpsrrrspdcgrmeiAagsf 

S EEQFWEACAELQQPALAGADWQLLVETSGIS I YRLLDKKTCLY 
Kriwr ^ViEDCSPTLLADIYMDSDYRKQVJDOYVKELYEQECNGE 
TVVYWEVKYPFPMSNRDY/VYLRQRRDLDMEGRKIHVIIAR3TSM 
PQLGERSGV1 RVKQ Y KQSLAI E SDG KKGS KVFMYYFDNPGGQI P 
S Wbl NWAAKNG VPNFLKDMARACQM YLKKT 


6446 


1 


1651 


KCPXRSPPPDTPGSRGTTAMCSLASGA'TOGkGAVENEEDLPELS 
DSGDEAAWEDEDDADLPHGKQQTPCLFCNRIiFTSAEETFSHCKS 
EHQPNIDSMVHKHGLEFYGYIKLINFIRLKNPTVEYMNSIYNPV 
PWEKEBYLKPVLEDDLLLQPDVBDIiYEPVSVPFSYPNGLSENTS 
VVBKLKHMEARALSAEAAIARAREDI^KMKQFAQDFVMHTDVRT 
CS S STSVIADLQEDEDG VYFSS YGHYGIHEEMLKDKI RTES YRD 
FI YQNPH I FKDKWLDVGCGTG1 LSMFAAKAGAKKVLGVDQSE I 


6447 






lyqamdiirlnkledtitlikgkieevhlpvekvdvusewmgV 

FLLFESMLDSVLYAIOnCYLAXGGSVYPDICTISLVAVSDVNKHA 
DRIAFWDDVYGPKMSCMKKAVIPEAWEVLDPKTLISEPCGIKH 
I DCHTTS I SDLBPSSDPTLKITRTSMCTAIAGYFD1 YFEKNCHN 
R WFS TG PQS TKTH WKQTVFLL E KPPS VKAGEALKGKVTVH KNK 
[ KDPRSLTVTLTLNNSTQTYGLQ 


6448 


1554 


1068 


RLGPAEWHI^GPCHATLGAANRGRALGVi^WRGAPlidQRVMMP 
SRTNLATGI PSS KVK.YSR1SS TDDGY1DLQPKXT2P KI P YKAXA 

LATVLPHGAPLIIIGSLLLSGYISKGGADRAVPVLIIGILVFL 
PGFYHLRIAYYASKGYRGYSYDDIPDPDD 


6449 


74 


559


GQVLS HCYH YRSS RWRRGGLS RGRGAGVMAIiVPYEETTE FGLQK 
FHKPLiATFS FANHT I Q IRQD WRHLGVAAWWDAA I VLS TYLEMG 
AVELRGRS AVEIX3AGTGLVG I VAALLACRI R YERDNN FLAMLER 
QFIVRKVHYPPEKDVHIYEAQKRNQKEDL 


6450 


597 


1876 


EyGVCENLRKJjEITGVSCRDVYAKLLHRYRHjt/SLWQPDIGPYG 
GLLNWVDGLFIIGWMYLPPHDPHVDDPMRFKPLFRIHLMERKA 
AT VECM YGHKGPHHGHI Q I VKKDE FSTKCNQTDHHRMSGGRQBE 
FRT WLREBWGRTLED I FH EHMQEL I LMKF I YTSOYDNCIiTYR T? T 
YLP PSRPDDLIKPGLFKGTYGSHGLEI VMLS FHGRRARGTKITG 
DPNIPAGQQTVEIDLRHRIGLPDLENQRNFNBLSRIVLEVRERV 
RQB QQEGGHEAGEGRGRQGPRESQ PS PAQPRAEAPSKGPDGTPG 
| EDGGEPGDAVAAAEQPAQCGQGQ PFVLPVGV S SRNEDYPRTCRM 
i CFYGTGL I AGHGFTSPE RTPGVF ILFDEDRFGF VWLELKS FSLY 
SRVQATFRNADAPSPQAFDEMLKNIQSLTS 




848


269 


tf V PAPRTVS G KRS LPGE WEERG EGEQRTGRB FSGNGGRAVE AAR 
KRI*L CG LWLW LSbLKVLQAQTPTPIi PLPPPMQS FQGNQFQGEWF 
VLGLAGNSFR P EHRALLNAFTATFB LSDDGR FE VWNAMTRGQHC 
DTWS YVLI PAAQPGQFTVDHRVWTHEQAGR PQDQPAGQBLVAAS 
RDAG PVHLPGQSSGPLG 


6451
6452


232 


939 


HbPTPPTSPKASTMEDVKLEFPSLPQCKEDAEEWTYPMRREMQE 
rLPGLFLGPYSSAMKSICLPVLQKHGITHIICIRQNlEANFIKPN 
FQQL FRYLVLD I ADNP VEN I IRPFPMTKEFTOGSLQMGGKVLVH 
GNAGISRSAAFVIAYIMETFGMKYRDAFAYVQERRFCINPWAGF 
VHQLQEYEAIYLAKIiTIQMMSPLQIERSLSVHSGTTGSLKRTHE 
EEDDFGTMQVATAQNG 


6453 


1 


652 


RTRGESSNMEPtAA YPLKCSGPRAKVFAVLDS IVLCTVTLFLLQ'" 
t* KFLKP KIWSF YAFE VKDAKGRTVS LEK YKGK VSL WNVASDCQ 
LTDRNYLGLKELHKEPGPSHFSVLAFPCNQFGESBPRPSKEYES 
FARKNYGVTFPIFHKIKILGSEGEPAFRFLVDSSKKEPRWNFWK 
YLVNPEGQWKFWRPEEPIEVIRPDIAALVRQVI I KKKBDL 


6454


827 


223


HKKW1,PGLSMSPRRTLPPJ>LSLCLSLCLCLCLAAALGSAQSGSC 
RDKKNCKVVFSQQBLRKRLTPLQYHVTQBKGTESAFEGEYTHHK 
DPGIYKCVVCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD 
FS YGMHRVETSCSQ CGAHLGH I FDDGPR PTGKRYC INS AALS FT 
P ADSSGTAEGGS GVAS PAOADKAEr, 


6455 


827 


223


MKKWXiPGLaMSPRRTLPRPLSLCLSLCLCLCLAAALGSAQSGSC 
RDmJCKVVFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK 
DPGrYKCWCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD 
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCINSAALSFT 
PADSSGTAEGGSGVAS PAQADKAEL 




1042 


173 

1 


RVHiATVSASAAWDAIiGLPVRSHMQGSTRRMGVm'DVHRRFLQi; 
LMTHGVLEEWDVKKLQTHCYKVHDRNATVDKIjEDFIKNINSVIjE 
3LYIEI KRG VTEDDGRPI YALVNLATT SIS XMATDFAE NELDLF 

RKALELIIDSETGFASSTNILNLVDQLKGKXMRKKEAEQVLQKF 
/QWKWLIEKEGEFTIiHGRAILEMEQYIRETYPDAVKICNICHSL 














1*3 QGQS CETCG I RMHLP CVAK Y FQ SNAE PRC PHCNDY W P tiE" 1 P K 
VFDPEKERESGVLKSNKKSLRSRQH 


6456 




'555 


P J3f~>QT?QT QMtiTDftfQT T DUt wtrfftmrtmYT /•> tt» r> r"V t"V rrrT/-TT /WW ""* 
i^rya^Aji'lwlu^oJLjJL^VooaljRWLKVL/WVlJll^l^ 

TVEAEAALQNKVVALYFAAARCAP SRDFTPLLCD FYTALVAE AR 
RPAPFEWFVSADGSSQEMLDFMRELHGAWIiALPFHDPYRHELR 
KRYNVTAI PKLVIVKQNGEVITNKGRKQ IRERGLAC FQ D WVBAA 
DIFQNFSV 


6457 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKI IHFPDFDKKI PV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
1 1 LGKQ YSLNI ILSVFAI ILGAFIAAGSDLAFNLEGY I FVFLND 
I FTAANGVYTKQKMDPKELGKYGVLFYNACFM 1 1 PTL I XSVSTG 
DLQQATEFNQWIQWVFILQFLLSCFLGFLLMYSTVIiCSYYNSAL 
TTAWGAIKNVS VAYIG ILIGGDYI FSLLNFVGLNI CMAGGLR Y 
SFLTLSSQLKPKPVGBENICLDLKS 


6458 


23 


892 


PlWFPVOWFPWNWPDGKPPIMILYVfllttAIKiitlFPDFDKKIPV
KLFPLPLLYVGNHISGLSSTSKXSLPMFTVLRKFTIPLTLLLET 
I ILGKQYSLNI ILSVFAI I LGAFI AAG SDLAFNLEG Y I FVFLND 
I FTAANGVYTKQKMDPKELGKYGVLFYNACFM I IPTLI I SVSTG 
DLQQATEFNQ WXNWF ILQFLLS CFLGFLLMYSTVLCS Y YNS AL 
TTAWGAI KNVS VAY IG I L IGGDY I FS LLNFVGLN I CMAGGLR Y 
SFLTLSSQLKPKPVGEENI CLDLKS 


6459 


23 


892 


PTTG FPVTNFPWNWPDGKPP I MILYVSKLNKI IHFPDFDKKIPV 
KLFPLPLLYVGNHI SGLSS TSKLSLPMFTVLRKFTIPLTLLLET 
1 1 LGKQ YSLNIILS VPAI I LGAFI AAGS DLA FNLEGY I FVFLND 
IFTAANGVYTKQKMDPKELGKYGVLPYNACFMI IPTLI I SVSTG 
DLQQATEFNQ WKNWFJELQ FLLS C FLGFLLMYST VLCS YYNS AL 
TTAWGAI KNVS VAY IGI LIGGDY I FSLLNFVGLNI CMAGGLRY 
SFLTLSSQLKPKPVGEENI CLDLKS 


6460 


23 


892 


PTTGF P VTN F PWNWPDGKP P IM I L Y Vs KLNK I 1 H FP DF"bkki! P V 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
1 1 LGKQYSLN 1 1 LSVFAJ I LGAFI AAGS DLAFNLEGYI FVFLND 
I FTAANGVYTKQKMDPKELGKYGVLFYNACFMI I PTLI I SVSTG 
DLQQATE PNQWKNWFILQ FLLSCFLGFLLMYSTVLCSYYNSAL 
TTAVVGATKTJVQVAVTr?TT»Tf?f;nVTPQT.r.Kr , B^rr:T.>3Tr'Mnr*r>T t>v 

X Art v VUAXlui VOVnl J.VjJ.JJi.V3oL/I Xf O-ULlNr VLjJjIN 1 1 |»i f\l tit I >K Y 

SFLTLSSQLKPKPVGEENI CLDLKS 


6461 


1653 


360 


LQQRTLRITAVGQTHPIAWMAWBPSLGAFYGPASFITFVNCMYF 
LS I F I QLKRHPERKYELKEPTE EQQRLAANENGE INHQDSMS LS 
LI STSALENEHT FHSQLLGASLTLLLYVALWMFGALAVSLYYPL 
DLVFS FVFGATS LSFS AFF WHHCVNRBD VRLAW I MTCCPGRS S 

GCKLTNLQAAAAQCHANSLPLN5TPQLDNSLTEHSMDNDI KMHV 
APLEVQFRTNVHSSRHHKNRS KGHRASRLTVLREYAYDVPTS VE 
GSVQNGLPKSRLGNNEGHSRS RRAYLAYRERQYNPPQ QDS SDAC 
STLPKSSPJTFEKPVSTTSKKDALRKPAVVELENQQKSYGLNLAI 
QNGPI KSNGQEG PLLGTDSTGNVRTGLWKHETTV 


6462


3 


773 


S EELDRE KKLKEDS PRKTPNKESGVPSL P VSLTS I KEEPKEAXH 
PDSQSMEES KLKNDDRKTPVNWKDSRGTRVAVSSPMSQHQSY IQ 
YLHAYP YPQM YDP SHPAYRAVS PVLMHSYPGAYLS PGFHYPVYG 
KMSGRE ETE KVNTS PS VNTKTTTES KALDLLQQHANQ YRS KS PA 
PVEKATAEREREAER3RDRHS PFGQRHLHTHHHTHVGMGYPL I P 
GQYD P FQGLTSAALVASQQVAAQAS ASGM FPGQRRE 


6463 


2 


350 


VILCILGGWIFKNADRSMEKKKGEPRTRAEARPWVDEDLKDSSD 
LHQAEEDADEWQESEENVEHIPFSHNHYPEKEMVKRSQEFYELL 
NKRRSVRFISNEQVPMEVIDNVIRTAGL 


6464 


12 


1154 


GILRQKEREERNRIHKKEILFLEHLLVVPSEMSSLSGKVQTVLG 








LVEP SKLGRTLTHEHLAMTFDC CYCPPPPCQEAISKEPI VMKNL 
YWIQKNAYSHKBNIiQLNQETKAIKEELLYFKANGGGALVENTTT 
GISRDTQTLKRLABBTGVHIISGAGFYVDATHSSETRAMSVEQL 
TDVLMNBILHGADGTS I KCGI IGEIGCSWPLTESERKVLQATAH 
AQAQhG CP VI IHPGR SSRAP FQI IR ILQEAGADISKTVMSHLDR 
TILDKKELLEFAQLGCYLEYDLFGTELLHYQLGPDIDMPDDNKR 
IRRVRLLVBEGCBDR I LVAHDIHTKTRLMKYGGHGY^HTT^TMW 
PKMLLRGITBNVLDKILIENPKQWLTFK 


6465 


126 


1356 


W^FFKTLRNHWKKTTAGLCLLTWGGHWLYGKHCDCrLLRRAAC 
QE AQVFGNQL I PPNAQVK KATVF LNPAACXG KAR TT »fp tf wn, iot 

LHLSGMDVTIVKTDYEGQAKKIJLEI^ENTDVIIVAGGDGTLQEV 
VTGVLRRTDEATFSKIPIGFIPLGETSSLSHTLFAESGNKVQHI 
TDATLAIVKGETVPLDVLQIKGEKEQPVFAMTGLRWGSFRDAGV 
KVS KYWYLE PLK I KAAH F FSTLKE WPQTHQAS I S YTG P TERP PN 
EPEETPVQRPSLYRRILRRLASYWAQPQDALSQEVSPEVWKDVQ 
LS TI ELS I TTRNNQLD P TSKEDFHTI CIEPDT ISKGD F I TIGSR 
KVRNPKLHVEGTECLQASQCTLLI PEGAGGS FS IDSEEYEAMPV 
BVKLLPRKLQFFCDPRKREQMLTSPTQ 


6466 


1134 


828 


VAKCi'l'kLSQLEKAHPPADKGRRKSKRKPPPKXKMTGTLETQPXC 
PFCNHEKSCDVKMDRARNTGVlSCTVCLEEFQTPITYIiSEPVDV 
YSDWIDACEAANQ 


6467 


301 


2571 


GELRVLALAHGELACHAVLTASLLSLRSRLMDSDMDYERPNVET 
IKCVVVGDKAVGKTRL I CARACWATLTQYQLLATHVPTVWAI DQ 
YRVCQEVLERSRDWDDVSVSLRLWDTFGDHHKDRRFAYGRSDV 
VVLCFSIANPNSLHHVX'TMWYPEIICHFCPRAPVILVGCOLDLRY 
ADLEAVNRARRPLARPIKPNE ILPPE KGREVAKELGI PYYETSV 
VAQ FG I KDVPDNAI RAAL I S RRH LQ F WKS HLRNVQ R PLLQ AP FX> 
PPKPPPPIIWPDPPSSSEECPAHLLEDPT.ranvTT vt rnroimT 
FAHKIYLSTSSSKFYDIiFLMDLSBGELGGPSBPGGTHPBDHQGH 
SDQHHHHHHHHHGRDFLLRAASFDVCESVDEAGGSGPAGLRAST 
SDGILRGNGTGYLPGRGRVLSSWSRAFVSIQEEMAEDPLTYKSR 
I^IVVVKMDSSIQPGPFRAVLKYLYTGELDENBRDLMHIAHIAEL 
LBVFDLRMMVANI LNNEAFMNQEITKAFHVRRTNRVKECLAKGT 
FSDVTFrLDDGTISAHKPLLISSCDWMAAMFGGPFVESSTREW 
FPYTSKSCMRAVLEYLYTGMFTSSPDLDDMKLIILANRLCLPHL 
VALTE QY T VTG LMEATQM M VD I DGD VLVFL ELAQ FHCA YQLAD W 
CLHH3 CTNYNWVCRKP PRDMKAMSPENQEYFE KHR WPP VWYLKE 
EDHYQRARKEREKEDYLHLKRQPKRRWLFWNSPSS PSSSAASSS 
SPSS5SAW 


6468 
6469


3 


1374 


AWAGTNMAALAPVGS PAS RGPRLAAGLRLLPMLGLLQL LAEPG 
LGRVHHLAIiKDDVT^KVHLNTFGFFKDG YMVVNVS SLSLNB PED 
KD VTI GFS LDRTKNDGFSS YLDEDVN YCI LKKQS VS VTLLI LD I 
SRSEVRVK3PPEAGTQLPKIIFSRDEKVLGQSQBPNVNPASAGN 
QTQKTQDGGKS KRSTVDS KAMGEKSFS VHNNGGAVSFQFFFNI S 
WDDQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA 
GElPLPKIiYISMAFFFFLSGTIWIHII^KRRNDVFKIHWLMAAL 
PFTKSLS LVFHAI DYHYISSQGFPIEGWAWYYITHLLKGALLF 
ITIALIGTGWAFIKHILSDKDKKIFMIVIPRRVLANVAY1IIES 
TEEGTTEYGLWKDSLFLVDLLCCGAILFPWWSIRHLQEASATD 
GKGKFSRAHFVLLSLL 




3 


1374 


DAWAGTNMAAIAPVGSPASRGPRLAAGIiRLLPMW 

LGRVHHLALIGJDVRHKVHLNTFGFFlulGYM\A/JXV^ 

KDVTIGFSLDRTKNDGFSS YLDEDVN YCILKKQSVSVTLLILD I 

SRSEVRVKSPPEAGTQLPKIIFSRDEKVLGQSQEPNVNPASAGN 

QTQKTQDGGKSKRSTVDSKAMGEKSFSVHNNGGAVSFOFFFNIS 

TDDQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA 











GEIPLPKIiyiSMAFFFFLSGTIWIHllikKRRNDVFKIHWt.MAAL 
PFTKSLSLVFHAIDYHYISSQGFPIEGWAWYVITHLLKGALLF 
ITIAL IGTGWAFIKHI LSDKDKKI FMI VI PRRVLANVAYI I IBS 
TEEGTTE YGLWKDSL KLVDLLCCGAILFP VVWS I RH LQJ5 ASATD 
GKCKFSRAHFVLLSLL 


6470 


2726


1437


AAASGVSS RADAPVLAQS PAS AGNGRPSTPRVPGSRRHPSAPRS
GPL PREDGCRTPG PQLLPLPG ALLR PRTLLSS AAETGRS RHFDT 
QHPSSGGRCRGGTBS PSSAAGRPASMAEAEEDCHSDTVRADDDE 
BNESPABTDLQAQLQMFRAQWMFELAPGVSSSNLENRPCRAARG 
SLQKTSADTKGKQEQAKEEKARELFLKAVEEEQNGALYEAIKFY 
RRAMQLVPDXEFKITYTRSPDGDGVGNSYIEDNDDDSKMADLLS 
Y FQQQLTFQES VL KLCQ PELES SQIHI S VLPMEVLM Y I FRWWS 

SDLDLRSLEQLSLVCKGFYICARDPEIWRLACLKVWGRSCIKLV 
PYTSWREMFLERPRVRFDGVYISKTTYIRQGEQSLDGFYRAWHQ 
VEYYRYIRFFPDGHVMMLTTPEEPQSIVPRLRTR 


6471 


1750 


295 


FFFDKMAAGGSGVGGKRSSKSDADSGFLGLRPTSVDPALRR'RRR 

GPRNKKRGWRRLAQEPLGLEVDQFLEDVRLQERTSGGLLSEAPN 

EKLFFVDTGS KEKGLTKKRTKVQKKSLLLKKPLRVDLILENTSK 

VPAPKDVLAHQVPNAKKLRRKEQLWEKLAKQGELPREVRRAQAR 

IiLNPSATRAKPGPQDTVERP F YDLWASDNPLDR PLVGQD B FFLE 

QTKKKGVKR PARLHTK PSQAPAVE VAPAGAS YWPS FEDHQTLLS 

AAHEVELQRQKEAEKLERQLAIiPATEQAATQES TFQELCE G LLE 

ESDGEGEPGQGEGPEAGDAEVCPTPARLATTEKKTEGQRRREKA 

VHRIjRVQQAALRAARLRHQELFRLRGIKAQV7U*RIJUiXAR 

RQARREAEADKPRRLGRLKYQAPDIDVQLSSEI.TDSLRTLKPEG 

NILRDRFKSFQRRNMIEPRERAKFKRKYKVKLVEKRAFREIQL 


6472 


3 


897 


SCGSDRAQWAMEFPFDVDALFPERITVLDQHLRPPARRPGTTTP 
ARVDI^QQIMTIIDEliGKASAKAQNLSAPITSASRMQSNRHVVY 
ILKDSSARPAGKGAIIGFIKVGYKKLFVLDDREAHNEVEPLCIL 
DFYIHESVQRHGHGREJCFQYMLQKERVEPHQLAlDRPSQKIiLKF 
LNKH YNLETTVP QVNNFV I FEGFFAHQHRP PAPSIJRATRHSRAA 
AVDPTPAAPARKLPPKRAEGDIKPYSSSDREFLKVAVEPPWPLN 
RAPRRATP PAHP P PRS S S LGNS PERG PLRPFVP 


6473 


22 


912 


aSAVEFVWEGBKMAAEPNKTBIQTLFKRLRAVPTNKACFDCGAK'" 
NPSWAS ITYGVFLCIDCSGVHRSLGVHLSFIRSTELDSNWNWFQ 
LRCMQVGGNANATAFFRQHGCTANDANTKYNSRAAQMYREKIRQ 
LGSAALARHGTDLWIDNM3SAVPNHS PEKKDSDFFTEHTQ P PAW 
DAP ATEPSGTQQPAPS TES SGLAQPBHGPNTDLLGTS PKAS LEI» 
KS S I IGKKKPAAAKKGLGAKKGLGAQ KVSSQS FSE IERQAQ VAE 
KLREQQAADAKKQAEESMVASMRLAYCELQIDR 


6474 


3 


462 


LQRQRQHPAAAPAVPVRCFTFCFTDI VI MPKRXS PENTEGKDGS 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKtSRGA 
KGKKEEKQEAGKEGTAPSENGETKAEEIHISRSTVNV3TSRGTP 
PSTLSVKGQIETVRVKGTEN 


6475 


3 


462 


LORQRQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS 

KGKKEBKQEAGKEGTAPSENGETKAEBIHISRSTVNVSTSRGTP 
PSTLSVKGQIETVRVKGTEN 


6476 


106 


1090 


ARAMAQ Y KGTMREAG RAMHLL KKRERQREQME VLKQR IAEET I L
KSQVDKRFSAHYI^VEAELKSSTVGLVTLNDMKARQEALVRERE 
RQXtAKRQHLE EQRLQ Q2RQREQEQRRERKRKISCLS FALDDLDD 
QADAAEARRAGNLGKNPDVDTSFLPDRDREEEENRLREELRQEW 
EAQ REKVKDE EMEVT FS YWD GSGHRRTVRVRKGNTVQQFLKKAL 
QGLRKDFLBLRSaGVEQLMFIKEDLILPHYHTFYDF I IARARGK 
SG PLFSFDVHDDVRLLSDATMEKDESHAGKVVLRSWYEKNKHIF 
PASRWEAYDPEKKWDKYTIR 


6477 


227 


915 


LQGHI^GIMAASRPLSR^BWGKNIVC^QRNYADH\mKMRSAVL 
SEPVLFLKPSTAYAPEGSPILMPArTR^HBLELGVVMGKRCR 
AVP EAAAMDYVGGYALCtDMTARDVQDECKXKGLPWTLAKS FT A 
SCPVSAFVPKEKIPDPHKLKLWLKVNGELRQEGETSSMIFSIPY 
IIS YVSKI I TLEEGD 1 1 LTGTPKGVGPVKENDE I EAG I HGLVSM 
TFKVEKPBY 


6478 




1495 


WSSRILPESLASSEASTLEAMGRKEEDDCSSWKKO^TNIRKTF 
I FMEVLGSGAFS E VFLVKQRLTG KL FAL KCIKKS P AFRDS SLEN 
EIAVLKKIKHENIVTLEDIYESTTHYYLVMQLVSGGELFDRILE 
RG V YTE KDAS LVIQQVLS AVKYLHENGI VHRDLKPENLL YLTPB 
ENSKIMITDFGWKMEQNGIMSTACGTPGYVAPEVj^QKPYSKA 
VDCWSIGVITYILLCGYPPFYEETESKLFEKIKEQYYEFESPFW 
DD I SES AKDF I CHLLEKD PNE RYTCEKALS H P W I D3NTALHRDI 
YPSVSLQIQKNFAKSKWRQAFNAAAVVHHMRKLHMNLHS pgvrp 
BVEWRPPETQASETSRPSSPEITITEAPVLDHSVALPALTQLPC 
QHGRRPTAPGGRSIjNCLTOGSLHISSSLVPlvlHaGSLAAGPCGCC 
SSCLNIGSKGKSSYCSEPTLLKKANKKQNFKSEVMVPVKASGSS 
HCRAGQTGVCXIM 


6479 


3 


949 


SCRGP GWHPAGGQAGAMELLSALS LGELALS FS R VPLPP V FDLS 
YFIVSILYLKYEPGAVEI^RRIIPIASWIiCAMIjHCFGSYILADLL 
LGEPLIDYFSNNSS ILLASAVWYLI FFCPLDLFYKCVCFLPVKL 
IFVAMKEVVRVRKIAVGIHHAHHHYHHGWFVMIATGWVKGSGVA 
I^SNFEQLLRGWi^ETNBIliHMSFPTKASLYGAILFTLQQTRW 
LPVSKASLI FIFTLFMVSCKVFLTATHSHSSPFDAIiEGYI CPVL 
FGS ACGGDHHHDNHGGS HSGGGPGAQHS AMP AKS KEELS EG S RK 
KKAKKAD 


6480


192 


514 


DFMS I YFPlHCPDYIiRSAKMTEVMMNTQ PMEEI GLSPRKDGLS X 

QIPPDPSDFDRCCKLKDRLPSIWEPTEGEVESGELRWPPEEFL 
VQEDEQDNCEETAKENKEQ 


6481 
6482 


110 


1131 


KSRMDIJ)VVIWFVIAGGTIjAIPILAFVASFLLWPSALIR 

WRRTIK3MQVRYVHHEDYQFCYSFRGRPGHKPSILMLHGFSAHKD 

MWLSVVKFLPKNLHLVCVDMPGHEGTTRSSLDDLSIDGQVKRIH 

qfveclkwjkkpfhlvgtsmggqvagvyaayypsdvsslwlvcp 
aglqystdnqi^qrlkeiogsaavekiplipstpeemsemlqlc 
syvrfkvpqqii^lvdvriphnnfyrklfleivseksryslhq 

NMDK I KVP TQ 1 1 WG KQ DQ VLD VS GAD MLAKS I ANC Q VELLEN CG 

hswkerprxtakli I dflasvhntdnnkkld 




2517 


568 


epvskvsqsrrkagvptanieesqAveaamanvpwaevcekfqa 
alalsrvelhknpekbpytcskysaralleevkallgpapededb 
rpeaedg p gagdhalglpaewe peg p vaqravrlavi efhlg v 

NHIDTEELSAGEEHLVKCLRLLRKYRL3HDCI3LCIQAQNNLG1 

lwsereeietaqaylessealynqymkbvgsppldpterflpee 
eklteqersk^fekvythnlyylaqvyqhlemfekaahychstl 
krqlehnayhpiewainaatlsqfyinklcfwearhclsaanvi 

FGQTGKISATBDTPEAEGEVPELYHQRKGEIARCWIFCYCLTIrMQ 

GELCDAISAVEEKVSYLRPLDFEEARELFLLGQHYVFEAKEFFQ 
IDGYVTDHIEWQDHSALFKGLAFFETDMERRCKMHXRRIAMLE 
PLTVDLNPQYYLLV1TOQIQFEIAHAYYDMMDLKVAIADRLRDPD 
SHIVKKIKWLNKSALKYYQLFLDSLRDPNKVFPEHIGEDVLRPA 
MLAKFRVARLYGia ITADPKKELENLATSIjEHYKFIVDYCSKHP 
EAAQB IEVELEjUSKEMVS LLPTKMERFRTKMALT 


6483 


3 


623 


NSHI>LCX5LRARAPLSANGREARAMEQR 

PAASQGAQTPGEKAEAAATLKAAPGWLKRFLVWKPRPASARAQP 
SLVQEAAQPO^STSETPWNTAIPLPSCWDQSFLTNITFLKVLLW 
LVLIX3I^BLEFGIAYFVI,SXiFYWMYVGTRGPBSKKEQEK3AYS 





6484


201 


965 


VFNPGCBAIQGTLTAEQLERELQLRPLAGR 
QLAVKTKMSGLRPGTQVDPEIELPVKAGSDGESIGNCPFCQRLP 
MILWLKGVKFNVTTVD^RKPEELKDIAPGTNPPFLVYNKELKT 
DFIKIEEFLEQTLAPPRyPHLSPKYKES PDVGCNLFAKFSAYI K 
NTQKEANKN PE KSLLKEPKRLDD YLNTPLLDE ID PDS AEEP PVS 
RRLFLDGDQLTLADCSLLPKLNI IKVAAKKYRDFDI PAEFSGVW 
R YLHNAYARE EFTHTCP EDKE I ENTYANVAKQ KS 


6485 


6 


1091 


FVDLVRAVEFLPCPDSQKLEKBCQSSEESMGSNSMRSILEEDEE
DE E P PR VU »YHBPRS FEVGMLVWHKHKKY PFWPAWKS VRQRDK 
KASVLYI EGHMNPKMKGFTVSLKSLKHPDCKEKQTLLNQAREDF 
NQD IGWC VS L I TDYRVRLGOGS FAGS FLEYYAADI S Y PVR KS IQ 
QDVLGTKLPQLSKGSPE2PWGCPLGQRQPCRKMLPDRSRAARD 
RANQKLVEYIGKAKGAESHLRAILKSRKPSRWIjQTFLSSSQYVT 
CVETYLEDEGQLDLWKYLQGVYQEVGAKVLQRTNGDRIRFILD 
VLLPEAI ICAI SAGDEVDYKTAEE KYIKGPSLS YRE KEI FDNQL 
LEERNRRRR 


6486 


10 


581 


LVLQAGGAHLSPSRVTQGIYVMLAFSEMPKPPDYSELSDSLTLA 

ggtgrfsgplhramrmmnfrqrmgw igvglyllasaaafy YVFE 

I SETYNRLALEH IQQHPEE P LEGTTWTHSLKAQLLSLP FW VWTV 
I FLVP YLQMFLFLYS CTRADPKT VG YCI 1 P 1 CLAV ICNRHQAFV 
KAS NQ I SRLQL IDT 


6487 


352 


863 


SFLKPLRGKMSVTLHTDVGDIK1BVFCERTPKTCENFLALCASN
YYNGCI FHRN I KGFMVQTGD PTGTGRGGNS I WGKKFEDE YSEYL 
KHNVRGVVSMAKNGPNTNGSQFFITYGKQPHLDMKYTVFGKVID 
GLETLDELEKIiPVNEKTYRPLNDVHI KDITIHANPFAQ 


6488


878 


241 


TALQEFGTSGPPL3LRFALPSGTGRFKPLFGARGPSWPPSPRVP"" 
MEPPNLYPVKLYVYDLSKGLARRLSPIMIjGKQLEGIWHTSIVVH 
KDEF F FGSGG IS S CPPGGTLLGP PDS WDVGS TEVTEE I FLB YL 
SS LGESLFRGEAYNLFEHNCNTFSNE VAQFLTGRK3 PS YI TDLP 
SEVLSTPFGQALRPLLDS IQI QP PGGS S VGRPNGQS 


6489 


1457 


375 


KVAKMATALSEEELDNEDYYSLLWVRREASSEELKAAYRRLCML 
YHPDKHRDPELKSOAERLFNLVHQAYEVLSDPQTRAIYDIYGKR 
GLEMEGWEWERRRTPAEIREEFERLQREREERRLQQRTNPKGT 
I S VGVDATDLFDR YDEE YEDVSGSS FPQI EINKMHI SQS I EAPL 
TATDTAI IjSGSLSTQNGNGGGS INEAIjRRVTSAKG WGELE FGAG 
DLCGPLFGLKLFRNLTPRCFVTTNCALQFSSRGI RPGLTTVLAR 
NI^KNTVGYIiQWHCSS PLLQ VQRPHR1ITRACAPB PS FRPFLHVP 
TWUAKCSGARTPS TAKTS AAVKLREACLSGPGSGS HQLLLLTP R 
SKRRTGGG 


6490


3 


1183 


HEAG CE VWLG YGP RAAAAAAATVLFGGAG PTETM FVARS XAADH 
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDVmCTASWKT 
HSGSVWRVTWAHPEFGQVLAS CSFDRTAAVWEB I VGESNDKLRG 
QSHWVKRTTL VDSRTS VTDVKFAPKHMGLMLATCSADG 1 VRI Y2 
APDVhINLSQWSLQHEISCKLSCSCISWNPSSSRAHSPMIAVGSD 
DS S PNAMAKVQ I FEYNENTRKYAKAETLMTVTDPVHDIAFAPNL 
GRSFHILAIATKDVRIFTLKPVRKELTSSGGPTKFEIHIVAQFD 
1^SQVWRVSVWITGTVIJ\SSGDIX3CVRLWKANYMDKWKCTGIL 
KGNGSPVNGSSQQGTSNPSLGSNIPSLQKSLNGSSAGRKHS 


6491


3 


1183 


HEAGC^VWLGYGPWVAAAAAATVLFGGAGPTETMFVARSIAADH 
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDWHCTASVJKT 
HSGSVWRVTWAHPEFGQVLAS CS FDRTAAVWEEI VGESNDKLRG 
QSHWVKRTTLVDSRTSVTDVKFAPKHMGLMLATCSADGIVR I YE 
APDVMNLSQWSLQHEISCKLSCSCISWNPSSSRAHSPMIAVGSD 
DS SPNAMAKVQIFEYNENTRKYAKAETLMTVTDPVHDIAFAPNL 
GRSFH I LA IATKDVRI FTLKPVRKELTSS GGPTKFE I H I VAQFD 
NHNS QVWRVS WNI TGTVLAS SGDDGCVRLWKANYMDNW XCTGI L 


6492 


34 


2573 


| KGNGSPVNGS5QQGTSNP3LGSNIPSLQNSLNGSSAGRKHS 
IPFLKSCCCCCIiFDFPPPPi,DQVQEEECEVERVrSHGTP3CPFRK 
PDSVAFGESQSEDEQPENDLETDPPNWQQLVSREVLLGLKPCEI 
KRQE VI NELF YTERAHVRTL KVLDQVFYQRVS REGI LS P SELRX 
IFSNLED2 LQLHIGLNEQMKAVRKRNETS VIDQ IGEDIiLTWFSG 
PGEEKLKHAAATPCSNQ PFALEMIKSRQKXDS RFQTFVQDAESN 
PLCRRLQIiKDIIPTQMQRLTKYPLLLDNIATYTEWPTBREKVKK 
AADHCRQILNYVNQAVKEAENKQRLEDYQRRLDTSSLKLSEYPN 
VEBLRNLDLTKRKMIHBGPLVWKVNRDKTIDLYTLLLEDILVLL 
QKQDDRLVLRCHSKILASTADSKHTFS PVIKLSTVLVRQVATDN 
KALFVISMSDNGAQIYELVAQTVSEKTVWQDLICRMAASVKEQS 
TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGISVTGLQSPDRDLG 
LESTLISSKPQ3HSLSTSGKSSVRDLFVAERQFAKEQHTDGTLK 
EVGEDYQIAIPDSHTiPVSEERWALDALRNIiGLLKQLLVQQLGLT 
EKSVQEDWQHFPRYRTASQGPQTDSV1QNSENIKAYHSGEGHMP 
FRTGTGDIATC YSPRTSTES FAPRDS VGLAPQBS QASNI LVMDH 

mimtpemptmbpegglddsgehffdareahsdenpsegdgavnk 
eekdvnlrisgnyli ux3ydp vqesstdeevass ltlqpmtgx p 
avesthqqqhspqnthsdgaispftpeflvqqrwgameyscfei 

QSPSSCADSQSQIMSYIHKIEADIiEHUCKVEESYTILCQRLAGS 

altdkhsdks 


6493 


557 


1147 


tparmayqgsstsdcmsktldsasahfaasawsapvpsrseva 
keqntghnningwqpsgtsktlystnmai^ssspgisavqlvrt 

VGHTTTNHLI PALCTSS P QTLPMNNS CLTNAVHLNNVS WSP VN 
VHINTRTSAPSPTALKLATVAASMDRVPKVTPSSAISSIARENH 
EPERLGLNGIAETTVAMEVT 


6494


2425 


1052 


AVAGGARP CSTP SS PHRRCRRHRPR P L P R PPAAI MS ASAVY VLD 
LKGKVL I CRNYRGDVDMS EVE H FMPI LMEKEEEGMLS P ILAHG G 
VRFKW I KHNNL YLVATSKKNAC VSLVFS FLYKWQ VFS EYFKEL 
SEES IRDNFVI I YELLDELMDFGYPQTTDS KI LQEY ITQEGHKL 
ETGAPRPPATVTNAVSWRSEG I KYRKNEVFLDVI BSVNLLVSAN 
GNVLRSEIVGSIKMRVFLSGMPELRLGLNDKVLFDNTGRGKSKS 
VBLEDVKFHQCVRLSRFENDRTISFIPPDGEFELMSYRUNTHVK 
PLIWIESVIEKHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPN 
D ADSPKFKTTVG S VKWVPENS EI VWS I KS FPGGKB YLMRAHFGL 
PS VEAEDKEGKPPISVKFEIPYFTTSGIQVRYIiKI IEKSGYQAL 
irn VK X x TQNGDYQLRTQ 


6495 


2425 


1052 


avaggarpcstpssphrrcrrhrprpLprppaaiksasavyvld 

LKGKVL I CRN YRGDVDMSEVEHFM P ILMEKEEEGKLS P I LAHGG 
VRFMWIXHNNLYLVATS KKNACVSL VFS FLYKWQVFS EYFKEL 
BEES IRDNFVIIYELIJDEL^FGYPQTTDSKIIjQEYITQEGHKL 
ETGAPRPPATVTNAVSWRSEG I KYRKNBVFLDVIESVNLLVSAN 
GNVLRSEIVGSIKMRVTLSGMPELRIjGLNDKVLFDNTGRGKSKS 
VELEDVKFHQCVRLSRFE^RTISFIPPIXJEFBLMSYRLNTHVK 
PI>I WIES VIEKHSHSRIE ymi kaksqfkrrstannve ihi pvpn 

dadspkfkttvgsvk^pensbiwsiksfpggkeylmrahfgl 

psveabdkegkppisvkfeipyfttsgiqvrylkiibksgyoal 
pwvryitqngdyqlrtq 


6496


247 


559 


LRAVSLLPLQLVLPEYS IHSLFCIMFLCAQEWLTLGLNVPtiLFY 

hfwryfhcpadsselaydppvvmnadtlsycqkbawcklapyll 

SFF YYL YCMI YTLVS S 


6497


1053 


352 


ANTUICRLCPRRHI^PPCGAKMGNGTEES)YNFVFKVVLIGESGV~ 
GKTNLLSRFTRNEFSHDSRTTIGVEFSTRTVMLGTAAVKAQIWD 
TAGLER YRAI Tfi AYYRGAVGALLVFDLTKHQT YAWERWLKEIiY 
DHAEAT IVVMLVGNKSDLSQARBVPTEEARMFAENNGLLFLETS 
ALDS TNVBLAFETVLKEI FAKVS KQRQNSIRTNAITLGSAQAGC 








BPGPGEKRACCISL 


6498


2636 


272 


SIALCPWGTHLAGPTTMRLSSLLALLRPALPLltGLSLGCSriSt 
LRVSWIGGEGEDPCVEAVGERGGPQNPDSRARLDQSDBDFKPRI 
VPYYRDPNKP YKKVLRTRY IQTELGSRERLLVAVLTS RATLSTL 
AVAVNRTVAHHFPRLLYPTGQRGARAPAGMQWSHGDBRPAWLM 
SETLRHLHTHFGADYDWPPIMQDDTYVQAPRIjAALAGHLS INQD 
L YLGRAEEF I GAGEQARYCHGG FG YLLSRSLLLRLRPHLDGCRG 
DILSARPDEWLGRCLIDS LGVGCVSQHQGQQ YRSFELAKNRDPE 
KEGSSAFLSAFAVHPVSEGTLMYRLHKRFSALBLERAYSEIBQL 
QAQIRNLTVLTPEGEAGLSWPVGLPAPFTFHSRFEVLGWDYFTE 
QHTFS CADGAPKCPLQGASRADVGDALETALEQLNRRYQPRLRP 
QKQRLLNGYRRFDPARGMEYTLDLliLECVTQRGHRRALARRVSL 
LRPLSRVE I LPM P YVTBATRVQLVLP LLVAE AAAAPAFLE APAA 
NVLEPREHALLTLLLVYGPREGGRGAPDPFLGVKAAAAELERRY 
PGTRIjAWIAVRAEAPSQVRLMDWSKKHPVDTLFPLTTVWTRPG 

pe vlnrcrmnai sgwqaffp vhfqeftfpals pqrsppgppgagp 
dp ps p pgad psrgapiggrfdrqas aegcfynady laararlag 
elagqeeeealeglevmdvflrfsglhlfravepglvqkfslrd 
cs prls eelyhrcrlsnleglggraqlamalfeqeqakst 


6499 


3 


2040 


SCS ADT RPSGQ AWPTVGLRAAAGAFRTGS PIiALGPET PQ VACLP 
GH PP VRPQVSGG PGAMPDPAAHLPF F YGS I S RAEAEEHLKLAGM 
ADCLFLLRQCLRSbGGYVLSLVHDVRFHHFP IERQLNGTYAIAG 
GKAHCGPAELCEFYSRDPDGLPCNLRKPCNRPSGLEPOPGVFDC 
LRDAMVRDYVRQTWKDEGEALEQAI I SQAPQ VEKL IATTAHERM 
PWYHSSLTREEAERKLYSGAQTDGKFLLRPRKEQGTYALSLIYG 
KTVYHYLISQDKAGKYCIPEGTKFDTLWQLVEYLKLKADGLIYC 
LKEACPNSSASNASGAAAPTLPAHPSTLTHPQRRIDTLNSDGYT 
PEPARlTSPDKPRPMPMDTSVYESPYSDPEELKDKXLFLKRDNIj 
LIADI ELGCX5NFGS VRQGVYRMRKKQ IDVAI KVLKQGTEKADTE 
EMMREAQ1MHQLDMPYIVRLI GVCQAEALMLVMEMAGGGPLHKF 
LVGKREE IPVSNVAELLHQVSMGMKYLEEKNFVHRDLAARNVLL 
VNRHYAKISDFGl^KALGADDSYYTARSAGIO^IiKWYAPECINF 
RKFSSRSDVWSYGVTMWEALSYGQKPYKKMKGPBVT4AFIEQGKR 
MECPPBCPPELYALMSDCWIYKWEDRPDFLTVEQRMRACYYSLA 
SKVEGPPGSTQKAEAACA 


6500 


1773 


726 


TGPTHASADAWGLVRSVTBVJCANVRGNP CAAALS C PQAVLDAGK 
MLS ES S S FLKGVMLGSI FCAL I TMLGHI R IGHGNRMHHH EHHHIj 
QAPNKEDI LKISEDERMELSKSFRVYCI I LVKPKDVSLWAAVKE 
TOTKKCDKAEFPSSENVKVFESINMIXrNDMWLMMRKAYKYAFDK 
YRDQYNWFFLARPTTFAI IENLKYFLLKKDPSQPFYLGHTI KSG 
DLE YVGMEGGIVLSV2SMKRLNS LLNI PEKCPEQGGMI WKI S ED 
XQLAVCLKYAGVFAENAEBADGKD VFNTKSVGLS I KEAMTYH PN 
QWEGCCSDMAVTFNGLTPNQMHVMMYGVYRLRAFGPYFQ 


6501


1 


570 


LVGMS GGGTETP VGCEAAPGGGS KKRDS LGTAGSAHL I IKD LGE 
■LtibKLi LdJtUi r VI QOhTR x FVKE FEEKRGIiREMR VJjENLKNM IHE 
TNEHTLP KCRDTMRDS LSQVLQRLQAANDSVCRLQQREQERKKI 
HSDHLVASEKQHMLQWDNFMKEQPNKRAEVDEEHRKAMERLKEQ 
YAEMEKDLAKFSTF 


6502 


213 


1650 


AGNKPDP WAGRNRTAVLPDVS VFHRED VGWWRSWLQQ Si'QAVKE ' 
KS SEALE FMKRDLTE FTQWQHDTACT I AATASWKE KLATEGS 
SGATEKMKKGLSDFLGVTSDTFAPSPDKTIDCDVITLMGTPSG^ 
AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK 
KGE ISE LLVGS PS I RALYTKM VP AAVSHS E FWHR YFYKVHQLEQ . 
EQARRDAL KQRAEQS I SEE PGWEEEEEELMG IS P I S PKEAKVP V 
AKISTFPEGEPGPQSPCEENLVTSVEPPAEVTPSESSESISLVT 
QIANPATAPEARVLPKDLSQKLLBASLEEQGLAVDVGETGPSPP 








IHSKPLTPAGHTGGPBPRPFARVETLREBAPTDLRVFELNSDS3" 
KSTPSNNG KKGSS TDI S ED WEKD FDUDMTEEEVQKAIiSKVDASG 
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6503


213 


1650 


AGNKPDPWAGRNRTAVLPDVSVFHREDVGWWRSWLQQ3YQAVKE 
KSSEALEFMKRDLTE PTQWQHDTACTI AATAS VVKE KLATEGS 
S GATE KM KKGLSDPLGVI SDTFAPS PDKTIDCDVITLMGTPSGT 
AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK 
KGBISELLVGSPSIRALYTKMVPAAVSHSEFWHRYFyXVHQLEQ 
EQARRDALKQRAEQSISEEPGWEEEEEELMGISPISPKEAKVPV 
AKI STFPEGEPGPQSPCKENLVTS VEPPAEVTPSESSESIS LVT 
Q I ANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETG PSPP 
IHS KPLTPAGHTGGPEPRPPARVETLREEAPTDLRVFELNSDSG 
KSTPSNNGKKGSSTDI SEDWEKDFDLDMTBEEVQMALSKVDASG 
E VSGPGGSEG S E PNG PGCESS PQ P AQLSPQSGPCSCLR 


6504


2131 


1294 


G KVC_i VAHW VCLS I LS P PPAGM KT PNAQE AEGQQTRAAAGRATG 
SANMTKKKVSQKKQRGRPSSQPCRNIVGCRXSHGWKEGDEPITQ 
WKGTVLDQVPIN PSLYLVKYDU I DCVYGLELHRDERVLSLKILS 
DR VAS S H ISDANLANTI I GKAVEHMFEGEKGS KDEWRGMVLAQA 
PIMKAWFyiTYEKDPVLYMYQLLDDYXEGDLRIMPESSESPPTE 
REPGGWDGL IGKHVE YTKEDGSKR I GMVIHQ VEAKPS VYFI KF 
DDDFHIYVYDLVKKS 


6505


2131 


1294 


GKV CLVAH WVCLS I LS P P P AGMKT PNAQEAEGQ QTRAAAGRATG '" 
SANMTKKKVSQKKQRGRPSSQPCRNIVGCRISHGWKEGDEPITQ 
WKGTVLDQVP INPSLYLVKYDGIDCVYGLELHRDERVLSLKILS 
DRVASSHISDANLANTI I GKAVEHMFEGEHGS KDEWRGMVLAQA 
PIMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPE6SBSPPTE 
RE PGGWDGLI G KHVEYTKEDGSKRIGMVIHQVEAKPfl VY F I KF 
DDDFHIYVYDLVKKS 




1 


1350 


fciVSP PTS CCLTVAVADPGVSEGFRGFGAGCEMPGRGRCPDCGST 
ELVEDSHYSQSQLVCSDCGCWXEGVLTTTFSDEGNLREVTYSR 
S TGENEQ VSR S QQRGLRRVRDL CRVLQL PPTFEDTAVAY YQQ AY 
RHSGIRAARLQKKEVLVGCCVLITCRQHNWPLTMGAICTLLYAD 
LDVPS STYMQI VKLLGLDVPSLCLAELVKTYCSS PKLFQASPSV 
PAKYVEDKEKMLSRTMQLVELANETWLVTGRHPLPVITAATFLA 
WQSLQPADRLS CSLARFCKLANVDLPYPASSRLQELLAVLLRMA 
EQLAMLRVLRLDKRS WKH I GDLLQHRQS LVRSAFRDG TAEVET 
REKEPPGWG0GQGEG3VGNNSLGLPQGKRPASPALLLPPCMLKS 
PKR1CPVPPVSTVTGDENISDSEIEQYLRTPQEVRDFQRAQAAR 
QAATSVPNPP 


6507 




929 


RS«ASRLPELPSGCLVLQVQELVQ^GMEATVTIPIMQNKPHGA"" 
ARSVVRRIGTNLPLKPCARASFETLPNISD^CLRDVPPVPTLAD 
IAWIAADEBETYARVRS DTR PLRHTWKPS PLI VMQRNASVPNLR 
GSEERLIiALKKPALPALSRTTELQDELSHLRSQIAKI VAADAAS 
ASLrPDFLSPGSSNVSSPLPCFGSSFHSTTSFVISDITEETBVE 
VPELPS VPLLCSASPECCKPEHKAACSS SEEDDCVSI>S VRQqpa 
DMMGILKDFHRMKQSQDLNRSLLKEEDPAVL I SEVLRRKFALKE 
EDISRKGN 


6508 


862 


342 


WEARKRPQRWPSERREVRVPPPHLQRGRSGLEPGTPRKMAAARP 
S LGRVLPGSS VLFLCDMQEKFRHNIAYFPQ1VSVAARMLKNTTL 
DLLDRGLQ VHVWDACS SRSQVDRLVALARMRQSGAFLS TSEGL 
I LQLVGDAVHPQFKE I QKLI KE PAPDSGLLGLFQGQNS LLH 


6509


2 


1053 


FVNNPRGGRKRRRQAAVTQAATRASGTPSPRDGTMTQGKLSVAN
KAPGTEGQQQVHGE KKEAPAVPS AP PSYEEATSGEGMKAGAFPP 
APT AVPLHP SNA YVDP SSS SSYDNG FPTGDHEL FTTFS WDDQKV 
RRVFVRKVYTILLlQLLVTLAVVALFTFCDPVKDYVQAflPGW^ 
ASYAVFFATYLT1ACCSGPRRHFPWNLILLWFTLSMAYLTGML 








oonwiig vjJuv.iA»x HUjVCJjS VTVFSFQTKFDFTSCQGVLFVI* 
LMTLFFS GL ILAILLP FQ YVP WLHAVYAALGAGVFTLFLALDTQ 
LLMGNRRHSLSPEEY I FGALNI YLDI I Y I FTPFLQLFGTNRE 


6510 


37 


1156 


P CALDGCPQRGAVH P LLSS AMG LliAFLKTQ FVLHLLVG F VF WS 
GLVINFVQLCTLALWPVSKQLYRRLNCRIAYSLWSQLVMLLEWW 
SCTBCTLFTDQATVERFGKBHAVIILNHNPEIDFLCGWTMCERF 
GVLGSSKVLAXKBLLYVPLIGWTWYFLEIVFCKRKWEEDRDTW 
EGLRRLSDYPEYMWPLLYCBGTRFTBTKHRVSMEVAAAKGLPVL 
KYHLLPRTKGFTTAVKCLRGTVAAVYDVTLNFRGNKNPSLLG1L 
YGKKYEADMCVRRFPLEDI PLDEKEAAQWLHKLYQEKDALQEIY 
NQKGMFPGBQPKPAREPWTLLNFLSWATILLSPLFSPVLGVPAS 
GS PLLILTFLGF VGAGNGHCR 


6511 
6512 


2541 


1425 


GEEQ PIAAAPTECLEQVIGGAGDPGTWAS FPS PLPGPAPLKGGK 
TMATNFSDIVKQGYVKMKSRKLGIYRRCWLVFRKSSSKGPQRIiE 
KYPD2KSVCLRGCPKVTEISNVKCVTRLPKETKRQAVAII FTDD 
5 SARTFTCDSEliEAEEWYKTLSVE CLGSRLND I SLGE PDLLAPGV 
QCEQTDR FNVFLLPCPNLDVYGB CKLQI THENI YLWDIHNPRVK 
LVSWPI^SI^RYGRDATRFTFEAGRMCDAGEGLYTFQTQEGEQI 
YQRVHSATLAI AEQEKRVLLEM E KNVRLLNKGTEHYS YP CTPTT 
MLPRSAYWHH1 TGSQN X AEAS S YAGEGYGAAQAS SETDLLNRF I 
LLKPKPSQG0SSEAKTPSQ 




159 


807 


FGKKSTWFPLSRSI.RVASGRSCKLGHGGYTGSGPGFGEPRDSGA 
EVPSGSGRATGCERGGVRGARQGRAPGSSIWRKEPRMVCTRKTK 
TLVSTCVILSGMTNI I CLLYVGWVTNYtASVYVRGQEPAPDKKL 
EEDKGDTLKI IERLDHLENVI KQHIQEAPAKPEEAEABPFTDSS 
LFAHWGQELSPEGRRVALKQFQYYGYNAYLSDRLPLDRP 


6513 


2 


756 


FVS PE PGFS LAOLNIi I VJQLTDTKQItVHSFAEGQDQGSAYANRTA 
LFPDLLAQGNASIiRLQRVRVADEGS FTCFVS I RDFGSAAVSLQV 
AAPYS KPSMTLE?NKDLRPGDTVT I TCSS YQGYPEAEVFWQDGQ 
GVPLTGNVTTS QMANEQGL FDVHS ILRWLGANGT YSCLVRNPV 
LQQDAHSSVTITPQRSPTGAVEVQVPEDPWALVGTDATLRCSF 
SPEPGFSLAQLNLIWQLTDTKQLVHSFAEGQDQGSAYANRTALF 
PDLLAQGNASLRLQRVRVADE3S FTCFVS IRD FGSAAVSLQ VAA 
PYSKPSMTLEPNKDLRPGDTVTITCSSYQGYPEAEVFWQDGQGV 
PLTGNVTTSQMANEQGL FDVHS I LR WLGANGTYS CLVRN PVLQ 
QDAHS S VTITPQRSPTGAVE VQVPEDPWALVGTDATLRCS FS P 
EPGFS liAQLNL I WQLTDTRQLVHSFTEGR 


6514 
6515


985 


302 


VGIPGPTISSAAEMEDLLDl>3EELRYSiATSRAKMGRRAQQESA 
QAENHLNGKNSSLTLTGETSSAKLPRCRQGGWAGDSVIO^SKFRR 
tv ™ JC "* *• ivuiv*'^ &i_u\il»o u i i3t*L>l Fix PDLEE VQEEDFVZiQVA 
APPS I QIXRVMTYRDLDNDLMKYSAIQTLDGEI DLKLLTKVLAP 

EHEVRERNPSWQDDVGWDWDHLFTEVSSEVLTBWDPI.QTEKEDP 
AGQARHT 




1345 


30* 


GRVGSRRRGAAVPGGCGAGSTQLBVSASASCGALGS ADMNP 1 VV 
VHGGGAGPI SKDRKERVHQGIWRAATVGYGILREGGSAVDAVEG 
A WAL EDD PEPN AGCGSVLNTMGEVEMDAS IMDGKDLS AG AVS A 
VQ C IANP I KIAR LVMEKTPHCFLTDQGAAQFAAAMGVPEI PGEK 
L VTERNKKRLE KE KHE KG AQKTD CQKNLGTVGA VALD CKGNVA Y 
AT3TGGIVNKMVGRVGDSPCLGAGGYADN0IGAVSTTGHGESIL 
KVNLARLTLFHIEQGKTVEEAADLSLGYMKSRVKGLGGLIVVSK 
TGDWVAKWT5TSMPWAAAKDGKLHFG IDPDDTTITDLP 


6516 


1 


1402 


FRRIiRYLGQDATAAARDLRTRGUSGYCPSAtAROQVLVSALQQL 
KGRRSEHRNENQEMPYSTNKELILG1MVGTAGISLLLLWYHKVR 
KPGI AMKLPE FLS LGNTFNS I TLQDE I HDDQGTTV I FQERQLQ I 
LEKLNELLTNMEELK^EIRFLKEAJPKLEEYIQDELGGKITVHK 
ISPQHRARXRRLPTIQSSATSNSSEEAESEGGYITANTDTEEQS 


6517






t'FVPKAFNTRVEEl.NLDVLLUKVDHLRMSESGKSESFBLLRDHK 

EXFRDE IEFMWRFARAYGDMYELSTNTQEKKHYAN IGKTLSERA 

INRA PMNGHCHLWYAVLOG YVSEFEGLQN KINYGHLFKEHLDI A 

IKLLPEEPPLYYLKGRYCYTVSKLSWIEKXMAATLFGKIFSSTV 

QEALKNFLKAEELCPGYSNPNYMYLAKCYTDLEENQNALKFCNL 
1 ALLLPT VTK EDKEAOK PTvrn Br tmtct wo 


6518


3 


1414 


qrvwggssslnamvyvrghaedybrWqrqgargwdyahclpyfr 

KAQGHELGASRYRGADGP LR VS RGKTNHPLHCA FLEATQQAG YP 
LTEDMNGFQQEGFGWMDMT IHEGKRWSAACAYLHPALSRTNLKA 

eaetlvsrvlfegtravgveyvkngqshrayaskevilsggain 
spqllmlsg ignaddlicklgi pwchlpgvgqnlqdhlei yiqq 
actrpitlhsaqkplrxvciglbwlwkftcegatahletggfir 

au^vf h P£>i QFHFLPS 0 VIDHGRVPTQQEAYQVHVGPMRGTS V 

gwliclrsawpqdhpviqpnylstetdiedfrlcvkltrbifaqe 

ALAP FRG KELQ PGS H I QSDKE IDAFVRAKADSAYH PS CTCKMGQ 
PSDPTAWDPQTRVLGVENLRWDASIMPSMVSGNLNAPTIMIA 
EKAADI I KGQ PALWDKD VPV Y KPRTLATQR 




242 


1098 


MAWNPGSEPKTRVRPRARSFPI.PPPRAPRRRRHRLLRAVPGPSR 
RHRCRRRAPP PPSTMGDAGS BRSKAPSLPPRCPCGFWGSSKTKN 
LCS KCFAD FQXKQPDDDS AP STSNSQSDLFSE BTTS DNNNTS I T 
TPTLSPSQQPLPTELNVTSPSKEECGPCTDTAHVSL1TPTKRSC 
vi i L/oyocjM iSAif VKKPRIiLENTERS EETSRSKQKSRRRCFQOQT 

[ KLELV(XJBLGSCRCGYVFCMI«RLPEQHDCTFDHMGRGREEAIM 
KM VKLDRKVGRS CQR 1 GEGCS 


6519
6520 


3 


1113 


iiKKMAEPPSPVHCVAAAAPTATVSEKEPFGKI^LSSRDPPGSLS 
AKKVRTEEKKAPRRVNGEG GSGGNSRQLQPPAA PS PQS YGS PAS 

WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLL 
VP P TLLHAQ P HHLKL PAAAAAAS ANAJCSRR P KEKR E KE RRRHGL 

GGAREAGGASREENGEVKPLPRDKIKDKIKBRDKEKERBKKKHK 
VMNEIKKENGEVKILLKSGKEKPKTNI EDLQI KKVKXKKKKKHK 
BNEXRKRPKMYSK'QTO'PTr'Crrr.T TnuDnn»n\wr.T r xm»t»« n « 

KNLDTKN YDS KI PENSEFPFVSLKEPRVQNNLKRLDTLBFKQLI 
HIEffOPNGGASVIHCXQ 


6521


3 


1113


aKKMAEPPSFVHCVAAAAPTATVSEKEPFGKIiQLSSRDPPGSLS 
AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS 
NSFAPLSAAPSPSSSRSSPSFSAGTAVPSSASASLSQPGPRKLL 
VPPTLLHAQPHHLLLPAAAAAASANAKSRRPKBKREKERRRHGL 
GGAREAGGASRE BNGEVKPL PRDKI KDK I KERDKEKEREKKKHK 
VMNEIKKENGEVKIIiIJCSGKEKPKTNIEDLQI^KVKKKKJQCKHK 
ENEKRKRPKMYSKSIQTICSGLLTDVBDQAAKGILNDNIKDYVG 
KNLDTKNYDSKI PENS EFPFVSLKEPRVQNNLKRIiDTLBFKQLI 
H I EHQ PNGGAS V IHCLQ 




184 


1798 


JUi^ATDTSQGELVXIPKALPLlVGAQLIHAUiaGEKVSDSTMP ~ 
I RRTVNSTRETP P KS KLAEGEEEKPB PD I SSEES VSTVEEQENE 

TPPATSSEAEQPKGEPENEEKEENKSSEETKKDBKDQSKEKEKK 
VKKTI P S W ATLSAS Q liARAQKQTPMASSPRPKMDAILTEAIKAC 
FQKSGASVVAIRKYI IHKYPSLELERRGYLliKQALKRELNRGVl 
KQVXGKGASGSFVWQKSRKTPQKSRNRKNRSSAVDPEPQVKLE 
DVLPLAFTRLCEP KEAS YS L IRKYVSQYYPKL RVDI RPQLLKNA 
LQRAVERGQLEQI TGKGASGTFQLKKSGEKPLLGGSLME YAILS 
AIAAMNEPKTCSTTAI>KKYVLENHPGTNSNYQMHLLKKTLQKCE 
KNGWMBQISGKGFSGTFQLCFPYYPSPGVLFPKKEPDDSRDEDE 


6522 


1042


. 

1 

| I 

391


3BDESSEEDSEDEEPP PKRRLQJCKTPAXSPGKAASVKQRGSKPA 

?KVSAAQRGKARPLPKKAPPKAXTPAKKTRPSSTVIKKPSGGSS 
OCPATSARKJB 

fKWLRPSPRSHRTPESGRVLSLFRLPPPGHALSGSTPAPCWEED 






SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








BCLDYyGMLSLHRMFEWGGQLTECELEIjtiAFIiliDEAPGAAGGli " 
SRARSGLKbLLELERRGQCDESNLRLLGQLLRVLARHDLLPKLA 
RKRRRPVS PERYS YGTSSSS KRTEGS CRRRRQS SSS ANSQQGSP 
PTKRQRRS RGRPSGG ARRRR RG PQPHPSSS QSPPDL PLKAK 


6523 


2 


1097 


ASCQTRRRTAALDSGERIAGRRSPIAIAMASNPNDIVKQ3YVKI "* 
RSRKLG1FRRCWLVFKKASSKGPRRLBKFPDEKAAYFRNFHKVT 
ELHNIKNITRLPRETKKHAVAI IFHDETSKTFACBSELEAEEWC 
KHLCMECLGTRIWISLGEPDLLAAGVQREQNERFimTiMPTPN 
LDIYGECTMQ ITHENI YLWDIRTJAKVKLVMWPLSSLRRYGRDST 
WFTFESGRMCDTGEGLFTFQTREGEMIYQKVHSATLAIAEQHER 
LMLS MEQ KARl iQTSLTEPMTLSKS I S LPRS AYWHHITRQNSVGB 
I YS LQGNHENRHS DLTGKSCKTSENRFLEENAPLVMYG I THHL F 
MDTS TCKWHDLE 


6524


2 


1097 


ascqtrrrtaaldsgeriagrrspialamasnfndivkqgyvkT" 

RSRKLGIFRRCWLVFKKASSKGPRRLEKFPDEKAAYFRKFHKVT 
ELHNIKNITRLPRETKKHAVAI I FHDETSKTFACESEIiEAEEWC 
KHLCMECLGTRIiND IS LGEPDLLAAGVQHEQNERFNVYLMPTPN 
LD I YGECTMQI TH EN I YLWDI HNAKVKLVMWPLSSLRRYGRDST 
WFTFESGRMCDTGEGLFTFG/rREGEMIYQKVHSATLAIAEQHER 
LMLEMBQXARLQTSLTE PMTLSKS IS LPRSAYWHHITRQKTSVGE 

IYSLQGNHENRHSDiTGKSCKTSENRFLBENAPLVMYGITHHLP 
MDTSTCKWHDLE 


6525 


1 


1859 


GESPFSEEESIEFNPSSSGRSARTVSSNSFCSDDTGWPSSQSVS " 

PVKTPSDAGNS PIGFCPGSDEGFTRKKCTIGMVGEGS IQSSRYK 

KESKSGLVKPGSEADFSSSSSTGSISAPEVHMSTAGSKRSSSSR 

NRGPKGRSNGASSHKPGSSPSSPREKDLLSMLCRNQLSPVNIHP 

SYAPSSPSSSNSGSYKGSDCSPIMRRSGRYMSCGBNHGVRPPNP 

EQYLTPLO^KEOTVRHLKTKLKESERRLHERESEIVELKSQLAR 

MR E D W IEEE CHRVEAQ LAIiKEARKB I KQLKQVIETMRSSLADKD 

KGI QKYFVD IK IQNKKLES LLQSMEMAHSGS LRDBLCLD FPCDS 

PEKSLTLNPPLDTMADGLS LEEQVTGEGADRELLVGDS I ANSTD 

LFDEIVTATTTESGDLELVHSTPGANVLELLPIVKGQEEGSWV 

ERAVQTDWPYSPAISELIQSVLQKLQDPCPSSLASPDESEPDS 

MES FPESLSAL WDLTPRNPNSA ILLS P VETP YANVDAEVHANR 

LMRELD FAACVBERLDGVI PLARGGVVRQYWSSSFLVDLLAVAA 

P\A/PTVLWAFSTQRGGTDPVYNIGALLRGCCVVALHSLRRTAFR 

IKT 


6526 


2 


2034 


SGRAGBPEEWRGRQIIDSKETWIPFNSBDSQQLEEAysSGKGCN 
GRWPTDGGRYDVHLGERMRYAVYWDELASEVRRCTWFYKGDKD 
NKYVP YSESFS QVLEETYMLAVTLDEWKKKLBSPNRE II ILHNP 
KLMVHYQPVAGSDDWGSTPMEQGRPRTVKRGVENISVDIHCGEP 
LQ I DHLVFWHG I GPACDLR FRS I VQ CVNDFRSVS LNLLQTHFK 
KAQENQ QI GRVE FLPVN WHS PLHSTGVDVDLQRITLPSINRLRH 
FTNDT ILDVFFYNS PT YCQTI VDTVASEMNR I YTLFLQRNPDFK 
GGVSIAGHSLGSLILFDILTNQKDSLGDIDSBKGSLNIVMDQGD 
i tr i lid EiULt]\iMAjij& a e * u x r ttKEKVDKEALALCTDRDLQEIGIP 
1/jPRKKILNYFSTRKNSMGIKRPAPQPASGaNIPKESEFCSSSN 

trngd yld vgigqvs vkyprl i ykpe i ffafgsp igmfltvrgl 
kridpwyrfptckgffniyhpfdpvayriepmvvpgvefepmli 

PHHKGllKRMHLELREGLTRMSMDLKimiLGSLRMAWKSFTRAPY 

palqasetpeetbaepestsekpsdvntbetsvavkee vlp i nv 
gmlnggqridyvlqekpiesfneylfalqshlcywesbdtvllv 

LKEIYQTQGIFLDQPLO 


6527 


1 


922 


gwvpllsrilpsdacklykqginirldttlidftdmk'cqradl^ 
fifngdaapsbsfwldnbqkvyqrihheesemkteeevdilms 
sdiysatlstks is ftraqtgwlfredktervgnfladfylvng 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








lvlesrkrrehlseedilrnkaimeslskggnimeqnfbpirrq™ 

SLTPPPQNTI TW3EY I SAENGKAPHLGRELVCKES KKTFKATI A 
MSQEFPLGIELLtJWLEVVAPFKHFNKIiREFVQMKLPPGFPVKIi 
DI PVF PTITATVTFQE FRYDE FDG S I FTI PDD YKEDPSRFPDL 


6528 


1 


1073 


IjTGPAAAE PRCAADAGMKRALGRRKG VWLRLR KI LFCVtfGl7?lA j 
I P FL I KLCPG I QAKLI PLNPVRVPY FI DLKKPQDQGLNHTCNY Y 
LQPEEDVTIGVWHTVPAVWWKNAQGKDQMWYEDALASSHPIILY | 
LHGNAGTRGODHRVELYKVLSSLGYHWTFDYRGWGDSVGTPSE 
RGMTYDALHVFDWI KARSGDNPVYIWGHSLGTGVATNLVRRLCB 
RETPPnALILESPFTNIREEAKSHPFSVIYRYFPGFDWFFLDPI 
TSSG I KFANDENVKH I SCPLLILHAEDD P VVPFQLGRKL YS I AA 
PARS FRDFKVQFVPFHSDLG YRHK Y I YKS PELPR I LREFLGK3 E 
PEKQH 




363


2215 


THIRYNKIGWKTMSCGNE FVETLXKIG YPKADNLNGEDFDWLF 
EGVBDESFLKWFCGNVNBQNVLSERELEAFSILQKSGKPILEGA 
AI*DEALKrCKTSDLKTPRLDDKELEKLEDEVQTI>I*KLKNljKlQR 
RNKCQLMASVTSHKSLRLNAKBEEATKKIJCQSQGILNAMITKIS 
NELQ ALTDEVTQLMMFFRHSNLGQGTNPLVFLSQ F SLEKYLS QE 
EQS TAALTL YTKKQFFQG IHE WESSNESQFFNFL KIQTP S I CD 
NQE I LEERRLEMARLQLAY I CAQHQL I HLKASNS SMKSS I KNAE 
ESLHSLTSKAVDKENLDAKISSLTSEIMKLEKEVTQIICDRSLPA 
WRENAQ LLKM PWKGDFDLQI AKQDYYTARQEL VLNQLI KQKA 
SFBLLQLS YE3 ELRKHRDI YRQLENLVQELSQSNMMLYKQLEML 
TDPSVSQQINPRNTIDTKDYSTHRLYQVBEGENKKKELFLTHGN 
LEEVAEKLKQNISliVQDQLAVSAQBHS FFLSKRNXDVDMLCDTL 
YQGGNQLUjSDQELTEQFHKVESQLNKLNHLLTDILADVKTKRK 
XLANNKLHQMBREFYVYFLKDEDYLKDIV3NLETQSPCI KAVSL5 
D 


6530


128 


2986 


GAAHHGAIVQVHPLLPGSSTIMIHDLCLVFPAPAKAWYVSDIQ "" 
BLYIRVVDKVEIGKTVKAYVRVLDLHKKPFLAKYFPFMDLKLRA 
ASP 1 1 TLVALDE ALDNYTI TFLI RGVAIGQTSLTAS VTNKAGQR 
INSAPQQIEVFPPFRLMPRKVTLLIGATMQVTSEGGPQPQSNIL 
FS I SNES VAI/VS AAG LVQGLA I GNGT VS GLVQAVD AETGKW 1 1 
SQDLVQVEVLLLRAVRIRAPrMRMRTGTQMPIYVTGITNHQNPF 
S FGNAVPGLTFHMS VTKRD V1.DLRGRHHEAS IRIiPSQYSJFAMNV 
LGRVKGRTGLRAWKAVDPTSGQLYGLARELSDEIQVQVFBKLQ 
LLNPEI EAEQI LMS PNS Y I KLQTNRDGAAS LS YRVLDGPE KVPV 
VHVDEKGFLASGSMIGTSTIEVIAQEPFGAKQTI I VAVKVC PVS 
YLRVSMS P VLHTQNKEALVAVPLGKT VTFTVHFHDNS GDVFHAH 
SS VLNFATNRDDF VQ I GKGPTKNTCWRTVS VGLTLLRVWDAKH 
PGLSDFMPLPVLQAIS PELSGAMWGDVLCLATVLTSLEGLSGT 
WSSSANS I LHI DPKTG VAVARAVGS VTVY YE VAGHLRTYKEVW 
SVPQRIMARHLHPIQTSFQBATASKVIVAVGDRSSNLRGECTPT 
W«tivau>UjHFialLiJ.iuybQrKPAVFEFPSQDVir i vEPQFDTALG 
QYFCSITMHRLTDKQRKHLSMKKTALWSASLSSSHFSTEQVGA 
EVPFS PGL FADQAE I IiLSNHYTSSE IRVFGAPE VLENLB VKSGS 
PAVLAFAKEKSFGWPSFITYTVGVLDPAAGSQGPLSTTLTFSSP 
VTNQAI AI PVTVAFVVDRRGPGPYGASLFQHFLDSYQVMFFT&F 
ALLAGTAVM I IAYHTVCTPRDLAVPAALT PRASPGHS PH YFAAS 
SPTSPNALP PARKAS PPSGLWS PAYASH 


6531


84* — " 


1425 


PSASIPPSASPDPVPDIRTCHFCLVEDPSVGCISGSEKCTISSS 
S LCMV I TI Y YD VKVR FX VRG CG Q Y I S YR CQE KRNTYFAEYWYQA 
QCOQYDYCNSWSSPQLQSSLPEPHDRPLALPLSDSQIQWFYQAL 
NLSLPLPNFHAGTEPDGLDPMVTLSLNLGLS FAELRRMYLFLNS 
SGLLVLPQAGLLTPHPS 


6532 


2 


954 


AAGPPSEWNQDSLFPBPEPGPAPQVLLGPQGPGLIK6VAPPTL 
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to first 

amino acid 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








I TDSTGTHLVLTVTN KNAHS PGLS RGS PQQPS S Q PGS PAPAPSA 
QMDLEHPLQPLFGTPTSLLKKEPPGYEEAMSQQPKQQENGSSSQ 
QMDDLFDILIQSG3ISADFKEPPSLPGKBKPS PKTVCWSPIAAQ 
PSPSAELPQAAPPPPGSPSLPGRLEDFLESSTGLPLLTSGHDGP 
EPLSLIDDLHSQMLSSTAILDHPPSPMDTSELHFVPBPSSTMGL 
DLADGHLDSMDWLELSSGGPVLSLAPLSTTAPSLPSTDPLDGHD 
UQLBVJDSCh 


6533 


1798 


373 


STISWLARVEPPRRSSGVGAAiU.RFPGGSP^LPJ^CVLAliAVIi' 
ALLERXJNADSMSAHSMLCERIAIAKELIKRAESLSRSRKGGIEG 
GAKLCS KLKAELKFLQ KVE AG KVA1 XE SHLQS TNLTH LRAI VE S 
AENLEEWSVLHVFGyTDTLGEKQTLWDWANGGHTMVKAIGR 
KAEALHNI WLGRGQYGDKS XI EQAEDFLQASHQQPVQYSNPHI I 
FAFYNS VSSPMAEKLKEMGI SVRGDI VAVNALLDHPEELQPS E5 
ESDDEGPELLQVTRVDRENI LASVAF PTEI KVDVCKRVNLD I TT 
LITYVSALSYGGCHFIFKEKVLTEQAEQERKEQVLPQLEAFMKD 
KELFACESAVKDFQS ILDTIiGGPGBRERATVLI KR INVVPDQPS 
ERALRLVASSKINSRSLTI FGTGDTLKAITMTANSGFVRAANWQ 
GVKFSVFIHQPRALTESKEALATPLPKDYTTDSBH 


6534 


47 


596


KATRF ISAAFWLNKQGVS P AXLPHTS WS WSLQTLS FLFS GDLA 
EKSLQCFPCSAMLLELIPLLGIHFVLRTARAQSVTQPDIHITVS 
EGASLELRCNYSYGATPYLPWMERTVEEAFILLVCLJCPWRVASS 
LEKKEKEDESFQLLLG3RYNVI*KAHCLI*PLIRWLTSGDSLLSA0 
PHCPQGL 


6535 


250 


964 


LIKTFfKDVAIQRDLLPKBKNljETIjLTliA FLE I DKAFSSHARLS 
ADATLLTSGTTATVALLRDG I EL WAS VGDSRAILCRKGKPMKI» 
TIDHTPBRKDEKERIKKCGGFVAWMSLGQPHVNGRLAMTRS IGD 
LDLKTSG VIAEPE TKR IKLHHADDSFLVLTTDG INFM VNSQB I W 
DFVNQCHDPNEAAHAVTEQAI QYGTEDNSTAWVPFGAWGKYKN 
SEINFS FSRS FASSGRWA 


6536 


242 


1174 


S LVKEMTN QYG I LFKQEQAHDDA r WS VAWGTNKKENSETVVTGS " 
LDDLVKVWKWRDERI»DLQWS L EGHQLGWS VD I SHTL PI AAS S S 
LDAHI RLWDLENGKQI KSI DAGP VDAWTIAFS PDSQ YLATGTHV 
GKVNI FGVESGKKE YS LDTRGKFILS IAYS PDGKYLASGAIDG I 
INIFDIATGKLLHTLEGHAMPIRSLTFSPDSQLLVTASDDGYIK 
I YDVQHANLAGTLS GHASW VLN VAFCPDDTHFVSS SS DK3 VKVW 
DVGTRTCVHTF F0HQDQVWG VKYNGNGSKI VS VGDDQE IH I YDC 
PI 


6537 


1638 


921 


KR FNPPPTQG PDP SLVYRPDVDP E VAKDKAS FRNYTS GPLLDRV 
FTTYK1>MHTHQTVDFV]^KHAQFGGFSYKK^ 

DESDPDVDFPNSFHAFQTAEGIRKAHPDKDWFHLVGLLHDLGKV 

LALFGEPOWAWGDT P PVRro ona CMnroprvo TPnmrnnY Annmr 
**r-»j-ii. vjcrynnv vuisx jt r vvi^KrUnov Vr UJfif 1 irUUJNIrDXiQDPRY 

STELGMYQPHCGLDRVLMSWGHDGEARGGQWGGGGRWGTVGGGG 
AEAVPAGDTliSPQSTCTR 


6538
6539


3345 
216 


2412 
339 


PYLYDFLDALITCQTAPEEiAFIKLDGLAGMLTEQLRRLTKQVQE 
*\j%.nivr(.uuiLM±&t\Jiv£lti, z SJalmaj/L x XP VLMAQAKIYWNLENYPM V 

EKIFRKSVEFCNDHDVWKLNVAHVLFMQENKYKEAIGFYEPIVK 
KHYDNILNVSAIVLANLCVSYIMTSQNEKAEELMRKIEKEEEQL 
S YDD PNRKM YHLCI VNL VI GTL YCAKGNYB FGIS RVI KS LEP YN 
KiaGTDTWYYAKRCFLSIiLENMSKHMIVIHDSVIQECVQFLGHC 
ELYGTN I PAVI EQPL EEERMHVGICMTVTDBSRQLKAL I YE I IGW 
NK 


6540 


3 


391 


FLGAASPHPHFSSLAPHPDQPEFTPVQDELEAMELWGPGV ) 
LERLWLLLLRRPEDAMAEC PTLG EAVTOHPDRLWAWEKFVYLDE I 
KQHAWLPLT IEIKDRLQLRVLLRREDWLGRPMTPTQIGPSLLP 
IMWQLYPDGRYRSSDSSFWRLVYH I KIDGVBDMLLELLPDD j 






SEQ 
ID 

NO: 


Predicted 

beginning 

nucleotide 

location 

corresponding

to first 

amino acid 

residue of 

amino acid 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6541 


1155


536 


RTLVQRR I LM LLR K PARGRDLRGRGRGT PRGGRKGLLPTP DE F P 
RFEGGRKPDSWDGNREPGPGHEHFRUTPRPDHPPHDGHSPASRB 
RSSSLQGMDMASLPPRKRPWHDGPGTSBHREMEAPGGPSEDRGG 
KGRGGPGPAQRVPKSGRSSSLDGEHHDGYHRDEPFGGPPGSGTP 
SRGGRSGSNWGRGSNMNSGPPRRGASRGGGRGR 


6542 


3 


3775 


SWPRGRGETGGHPGALRTRTMQKSVRYNEGHALYLAFIiARKEGT 
KRGFLS KKTABASR WHEKWPAL YQNVLF YFEGEQS CRPAGM YLI# 
EGCSCERTPAPPRAGAGQGGVRDALDKQYYFTVLFGHEGQKPLE 
LRCEEEQDGKEWMEAIHQASYADIL1BREVLMQKYIHLVQIVBT 
EKIAANQLRHQLBDQDTBIERLKSEIIAI^KTKERMRPYQSNQE 
DEDPD I KK I KKVOS FMRGWLCRRKWXTIVQD YI CS PHAESMR KR 
NQIVPTMVEAESBYVHQLYILVNGFLRPLRMAASSKKPPISHDD 
VSS IFLNSETlMFLHEIFHQGLKARIANWPTIilLADLFDILLPM 
LNI YQBFVRNHQ YS LQ VLANCKQNRDFDKLL KQYEANPACEGRM 
LETFLTYPMFQ I PRYI 1TLHELLAHTPHEHVERKSLEFAKSKLE 
ELS R VMHDEVS DT EN I R KNLAI ERM IVEGCDI LiLDTS QTFI RQG 
SLI Q VPSVE RG KLS KVRLGSLSLKKBGERQC FLFTKH FL1 CTRS 
SGGKLHLLKTGGVI*SI*I DCTLIEE PDASDDDSKGSGQVFGHLDF 
KIWEPPDRAAFTVVLLAPSRQEK71AWMSDISQCVDNIRCNGLM 
TIVFEENSKVTVPHMIKSDARLHKDDTDICFSKTLKSCKVPQIR 
YASVERLLERLTDLRFLSIDFLNTFLHTYRIFTTAAVVIjGKIjSD 
I YKRPFTS I P VRSI#ELFFATSQNNRGEHLVDGK5PRLCRKFSSP 
PPIAVSRTS S PVRARKIhSLTSPLNS KIGALDLTTS SS PTTTTQS 
PAAS P PPHTGQ I PLDLS RGLSS PEQS PGTVEENVDN PRVDLCNK 
LKRS I QKAVLESAPADRAGVESS PAADTTELS P CRSPST PRHLR 
YRQPGGQTADNAHCS VSPASAFAIATAAAGHGS PPGFNNTERTC 
DKEFI IRRTATNRVLNVLRHWVSKHAQDFELNNELKMNVLNLLE 
EVLRDPDLLPQERKAAANILMALSQDDQDDIHIiKLEDI IQMTDC 
MKAECFESL SAMELAEQ I TLLDHV IFR3 IPYEE FLGQG WMKIiDK 

AD I CRCLHN YNGVLE ITS ALNRSAI YRLKKTWAKVS KQTKALMD 
KLQKT VS S EGRFKNI*RETLKNCN PPAVPYLGMYLTDLAFI EEGT 
PN FTE EG LVKFS KMRMI SH I IREIRQFQQTS YR I DHQPKVAQ YL 
LDKDIil I DEDTLYELSLKI EPRLPA 


6543 


1857 


950 


FVSGCGRAGIGLSWAMAABARVSRWYFGGi.R^rfiAAr'PTWPT r\T 
LKVHLQTQ QE VKLRMTGMALRWRTDG I IALYSGLS AS LCRQMT 
YSLTR FAI YETVRDRVAKGSQGPLPFHEXVLLGSVSGLAGG FVG 
TPADI>VNVRMQNDVKLPQGQRRNYAHALDGI/YRYAREEGLRRLF 
SGATMASSRGALVTVGQLSCYDQAKQLVLSTGYLSDNIFTHFVA 
SFI AGG CATFL CQ PLDVLKTRLMNS KGEYQG VFHCAVETAKI*GP 
LAFYKGLVPAG I RL IPHTVLTFVFLBOLRKNFGI KVPS 


6544 


630 


79 


PSPCFIRSRLDGQPWMAGLEAWLSQNFSLHQPQSRVRVRRASIS 
EPSDTDPEPRTLNPSPAGWFVQQHPELELMSSFRERFGRNWLQY 
RSHLEPSGNPLPATPTTSAP3APPASSQGPDTAPRPSPPQEEAR 
GPQESPQKMSEEVRAEPQEEEEEKEGKEEKEEGBMAPLPEAHLG 
EGKQKECP 


6545 


176 


560 


P PHSHAAI>L P AAMTPLLTLI L WLMGLPLAQALDCHVCAYNGDN "" 
CFNPMRCPAMVAYCMTTRTYYTPTRMK7SKSCVPRCFETVYDGY 
S KHASTTS CCQ YDLCNGTGLATPATIiAIiAP I LLATL WGL L 


6546 


1657 


364 


HLLNGLDEVAAFFVADLGAIVRKH FCFLKCIiPRVRPFYAVKCNS 
S PGVLKVLAQLGLGFSCANKAEMELVQHIGI PASKI I CANPCKQ 
IAQIKYAAKHGIQLLSFDNEMELAKWKSHPSAKMVLCIATDDS 
HSLSCLSLKFGVSLKSCRHLLENAKKHHVEWGVSFHIGSGCPD 
PQAYAQS I ADARLVFEMGTBLGHKMHVLDLGGGFPGTEGAKVRP 
EEIASVINSALDLYFPEGCGVDIFAELGRYYVTSAFTVAVSIIA 
KKEVLLDQ PGREEENGSTS KTI VYHLDEG VYG I FN3VL FDNTCP 






SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








tpilqkkpsteqplyssslwgpavdgcdcvaeglwlpqlhVgdW" 

L VFDNMG AYTVGMGS PF WGTQACH I T YAMS RVAWEALRRQLMAA 
EQEDDVEGVCKPLS CG WEI TDTLC VGP VFTPASIM 


6547 


1 


541 


LHSKYLAPALCSQPGMMRCCRRRCCCRQPPHALRPLLIiIiPLVLIj 
PPIAAAAAGPNRCDTIYQGFAECLIRLGDSMGRGGELETICRSW 
NDFHACAS QVLSGCPE EAAAVWESLQQEARQAPRPNNLHTLCGA 
P VHVRERGTGS ETNQETliRATAPAIi PMAPAPPLLAAALALAYLL 
RPLA 


6548
6549


2 


219 


fvsrlsvrdvrfptflgghgadamhtdpdysaayVpietdaedg 

IKGCGITFTLGKGTEVGELKH.SRFQNA 




73 


1490 


ktgrvcedarpacgsrsrrrrkeaapgiptpspssssptssrpa" 

ARAFSKAPARLSRPRJtftEEPPDPGRRYIQBEIIQARKHKLIKMC 
SSVAAKLWFLTDRRIREDYPQKBILRALKAKCCEEELDFRAWM 
DEV^/LTI EQGNLGLRINGEL ITAYPQWWRVPTPWVQSDSDIT 
VLRHLEKMG CRLMNR PQAILNC VNKFWTFQE LAGHGVPL P DT FS 
YGGH3NFAKMIDBAEVLBFPMWKm , RGHRGKAVFlARD 
DLSHLIRHEAPYLFQKYVKESHGRDVRVIWGGRWGTMLRCST 
DGRMQSNCSLGG VGMMCSLSEQGXQIAI QVSNILGMDVCGI DLL 
MKDDGSFCVCEANANVGFI AFDKACNLDVAGI XAD YAASLLPSG 

RLTRRMSLLSWSTASETSEPELGPPASTAVDNMSASSSSVDSD 
PESTERELLTKLPGGLFNMNQIjLANE IKLT VD 


6550 


2293 


922 


FRVSRDGAPDCGIEQMGLAMEHGGSYARAGGSSRGCWYYLRYFF 
LFVSLIQFLI I LGLVLFMVYGNVH VS TESNLQATERRAEGL YSQ 
LLGLTASQSNLTKELNFTTRAKDAIMQMWLKARRDLDRINASFR 
QCQGDRVI YTNNQR YMAAI I LSEKQCRDQF KDMNKS CDALL FML 
NQKVXTLEVE IAKEKTI CTICDKES VLriNKRVAPPnT.vpr^irTDi? 
LQHQERQLAKEQLQKVQALCLPLDKDKFEMDLRNLWRDSI I PRS 
LDNLGYNLYHPLGSBLASIRRACDHMPS LMS5KVEELARSLRAD 
IERVARENSDLQRQKLE AQQGLRAS QEAKQKVE KEAQAREAKLQ 
AECSRQTQriALEEKAVLRKERDNLAJCELEEKKREAEQLRMELAI 
RNSALDTCIKTKSQPMMPVSRPMGPVPNPQPIDPASLEEFKRKI 
LESQRPPAGIPVAPSSG 


6551 


157 


748 


IQPPDPRNWTLAAYKEKMKELPLVSLFCSCFIiADPLNKSSYKYE 
ADTVDLNWCVISDMEV1ELNKCTSG0SFEVILKPPSFDGVPEFN 
ASLPRRRDPSLEEIQKKLEAAEERRKYOEAELLKHLAEKREHBR 
BVIQICAIEENNNFIKMAKEKLAQKMESNKENREAHLAAMLERLQ 
EKDKHAEE VRKNKELKBE AS R 


6552 


157 


748 


IQPPDPRNWlIiAAYKEKMKELPLVSLFCSCFLADPLNKSSYKYE" 
ADTVDLNWCVI S DME VI BLNKCTSGQSFEVILKP PS FDGVPEFN 
ASLPRRRDPSLEEIQKKLEAAEERRKYQEAELLKHLAEKRBHER 
EVIQKAIEE^^^FIKMAKEKLAQKMBSNKE^^^EAHIJAA^^^8RLQ 
EKDKHAEEVRKNKELKEEASR 


6553


2 


1807


FVWS KMAAHLS YGR VNLNVLR SAVRRELREFLDKCAGSkAIVWD"" 
BYLTGPFGLIAQYSLLKEHEVEKMFTLKGNRLPAADVKMIIFFV 
RPRLBLMDI IAENVLSEDRRGPTRJDFHILFVPRRSLLCEQRLKD 
LGVLGS FIHREB YSLDLI PFDGDLLSMESEGAFKECYliBGDQTS 
LYHAATOLMTI^ALYGTIPQIFGKGECARQVANMMIRMKREFTG 
SQNS I FPVPDNLLLLDRNVDLLTPLATQLTYEGLIDEI YGIQNS 
YVKLPPEKFAPKKQGDGGKDLPTEAKKLQLNSAEELYAEIRDKN 
FNAVGS VLS KKAKI I SAAFEERHNAKTVGE I KQFVS QL PHMQAA 
RGSLANHTSI AELI KDVTTSEDFFDKLTVEQEFMSGI DTDKVNtf 
YIE^CIAQKHSLIKVLi^LVCLQSVCNSGIJCQKVLDYYKREILQT 
YGYEH1 LTLHNLE KAGLLKPQTGGRNNYPTIRKTLRLWMDD VNE 
QNPTD I S YVYSGYAPLSVRLAQLLSRPGWRS IEEVLR ILPGPHF 
BERQPLPTGLQKKRQPGENRVTLIFFLGGVTFAEIAALRFLSQL 
BDGGTE YV I ATTKLMNTGTS W I EALMEKPF 
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SEQ 
ID 
NO: 


Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 

amino acid

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6554 


119 


1244 


FEMGSQVSVBSGALHWIVGGGFGGtAAASQLQAIWPFMLVDM" 
KDSFHHNVAALRASVETGFAKKTF1SYSVTFKDNFRQGLVVGID 
LKNQMVLLQGGEALPFSHLILATGSTGPFPGKFNEVSSQQAAIQ 
AYEDMVRQVQRSRFI VWGGGSAGVEMAAB I KTEYPEKEVTLIH 
SQVALADKBLLPSVRQEVKBILLRKGVQLLLSERVSNLEELPLN* 
EYR B YIKVQTDKGTE VATNLVI I»CTGI KINSS AYRKAFESRLAS 
SGALRVNBIILQVEGHSNV YAIGDCADVRTP KMAYLAGLHANI AV 
ANI VNS VKQR PLQAYKPGALTFLLSMGRNDGVGQ ISG FYVGRLM 
VRLTKSRDLFVSTSWKTMRQSPP 


6555 


1552 


498 


IHMALLRK INQ VLLFLI* I VTLC VIL YKKVHKGT VPKNDADDES E 
TPBELEEE I P VVI CAAAGRMGATMAAINS I YSNTDANIL FYWG 
LRNTLTRIRKWIEHSKLREINFKIVEFNPMGLKGK1RPDSSRPE 
LLQPLNFVRFYLPIilHQHEKVIYLDDDVIVQGDIQELYDTTLA 
LGHAAAFSDDCDLPS AQD INRIiVGLQNTYMGYLD YRKKAI KDLG 
ISPSTCSFNPGVIVANMTEWKHQRITKQLEKWMQKNVEENLYSS 
SLGGGVATSPMLIVFHGKYSTINPLWHIRHLGWNPDARYSEHFL 
QEAKLLHWNGRHKPWDFPSVHNDLWESWFVPDPAGIFKLNHHS 


6556 


241 


1449 


ASLCKGCFFVTHVLVilltPStQSPPTFGFLLDIDGVLVRGHRVI 
PAAL KAFRRLVNS QGQLRVP WFVTNAGNILQHS KAQELSALLG 
CEVDADQVTLSHS PMKLFS E YHEKRMLVSGQGPVMENAQGLGFR 
NWTVDELRMAFPLLDMVDLERliLKTTPLPRNDFPRIEGVLLLG 
EPVRWETSLQLIMDVLLSlfGSPGAGLATPPYPHLPVLASNMDLL 
WMAEAKMPR FGHGTFLLCIi ET I YQKVTGKBLRYEGLMGKPS ILT 
YQYAEDLI RRQAK RRGWAAP I RKL YAVGDNPMS DVYGANL FHQ Y 
LQKATHDGAPELGAGGTRQQQPSASQSCISILVCTGVYNPRNPQ 
STB P VLGGGEP P FHGHRDLCFS PGLMEASHWNDVNEAVQLVFR 
KEG WALE 


6557 


2*99 


1534 


RMCGRTSC1LLPRDVLTRACAYQDRRGQQRLPEWRDPDKYCPSYN 
KS PQS NS P VLLS RLHFEKDADSS ERI I APMRWGLVPS WFKESDP 
SKLQFNTlT?CRSDTVMSKR£FKVPLGKGRRCVVIiADGFYE«QRC 
QGTNQRQP YFI Y F PQI KTEKSGS IGAADSPENWEKVWDNWRLLT 
MAG IFD CWEP PEGGDVLYS YT I ITVDS CKGLSD IHHRMPAI I*DG 
EEAVS KWLD FGEVS TQEALKL I HPTEN I TFHAVSS WNNSRNNT 
PE CLAP VDLWKKELRASG SS QRMLQ WIiATKSPKKEDSKT PQKE 
ESDVPQWSSQFLQKSPLPTKRGTAGLLEQWLKREKEEEPVAKRP 


6558 


21 


1138 


rUGRRRGGR^ELGSCLEGGREAAEEEGEPEVKKRRLLClTEFAS - 
VAS CDAAVAQCFLAENDWEMERALNS YFE PP VEESALERRPETI 
SEPKTYVDLTNEETTDSTTSKISPSEDTQQENGSMFSLITWNID 
GLDLNWLSERARGVCSYIiALYS PDVIFLQEVIPPYYS YLKKRS S 
HYBIITGREEGYFTAIMLKKSRVXLKSQEriPFPSTKNlMRNLLC 
VHVNVSGNELCLMTSHLESTRGHAAERMNQLKMVLKKMQEAPES 
AWIFAGDTTTLRDREVTRCGGLPNNIVDVWEFLGKPKHCQYTWD 
TQMNSNLG I TAACKLR FDRI FFRAAAE EGHI I PRSLDLLGLEKL 
DCGRFPSDHWGLLCNIjDIIL 


6559
6560


3 


364 


GPELSGLPraPKKIiHAWQTPIiVWnrnflqpQPQ<TTyrnDAawTiTi^7^ — 

DKSCRCGVCLPSTCPHTVWLLEPTCCDNCPPPCHIPQPCVPTCF 
LLNSCQPTPGLETLNLTTFTQPCCEPCLPRGC 




3 


1435 



TATSGGIWLRRKWRCHWPRPLPQSCVGTEGGLQVRDTSSRIAKG 
GWHTKMSLHGASGGHERSRDRRRSSDRSRDSSHBRTESQLTPC 
IRWVTS PTRQHHVEREKDHSSSRPSS PRPQKAS PNGS I S SAGNS 
SRNSSQSSSDGSCKTAGEMVFVYENAKEGARNIRTSERVTLIVD 
NTRFWDPS IFTAQPTO4IX3RMFGSGREHNFTRPNEKGEYEVAB 
3 1 GSTVFRAI LDYYKTG I IRCPDG I S I PBLREACDYLC ISFEYS 
rl KCRDLSALMHELSNDGARRQFE FYliEBM ILPLMVASAQSGER 
BCH IWLTDDD WDWDEE YPPQMGEE YSQ 1 1 YSTKLYR FFKYI E | 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








NKUVA&b VLKbkOLKK 1UL0 IJtti Y FTYKJBJCVKKRPGGRPBVI YN 
YVQRP FIRMS WE KBEGKSRHVDFQCVKSKSITNLAAAAADIPQD 
QLWMHPTPQVDBLDILPIHPPSGNSDLDPDAQNPML 


6561


3 


1086 


PGRRFRRKESSSSRWFPADCLLGLRGPASSI^SPEPSPSWPSRS 
P CPMAALTDLSFM YRWFKNCNL VGNLS E KYVF ITGCD SGFGNLL 
AKQLVDRGMQVLAACFTBEGSQKLQRDTSYRLQTTLLDVTKSES 
IKAAAQV7VRDKVGEQGLWALVNNAGVGLPSGPNEV7LTKDDFVKV 
INVNLVG L 1 3VTLHMIi PMVKRARGRWNMSSSGGRVAVIGGG V C 
VS KFGVEAFSDS I RREL YYFGVKVCI I E PGN YRTAILGKENLBS 
RMRKLWERIiPQETRDS YGEDYFRI YTDKLKNIMQVAR PRVRDVT 
NS MEHAI VSRS PR IRYNPGLDAKLLYI PLAKLPTPVTD F ILS RY 
LPRPADSV 


6562 


X 


1562 


MS TL YD I RAHKAQLLR F FAS S DSNKALEQRRTLHT P KLEH LDRV 
LYEWFLGKRSEGVPVSGPMLIEKAKDFYEQMQLTBPCVFSGGWl* 
WRFKARHGIKKLDASS EKQSADHQAAEQFCAFFRSLAAEHGLSA 
EQVYNADETGLFWRCLPNPTPBGGAVPGPKQGI05RLTVLMCANA 
TGSHRLKPLAI GKCS G PRAFKG IQHIiPVAY KAQGNAWVD KB I FS 
DWFHH1 FVPS VREHFRTIGLPEDS KAVLLLDSSRAHPQEAELVS 
SNVFTIFLPASVASLVQPMEQGIRRDFMRNFINPPVPLQGPHAR 
YNMN0AI FS VACAWNAVPSHVFRRAWRKLW PS VAFAEG S SSEBE 
LEAECFPVKPHNKSFAHILELVKEGSSCPGQLRQRQAASWGVAG 
REAEGGRPPAATSPAEWWSSEKTPKADQDGRGDPGEGEEVAWE 
QAAVAFmVXRFAERQPCFSAQEVGQLRALRAVFRSQQQVRRRR 
GALGAWKVEALQEG PGG CGATAQS PLP CSSTAGDN 


6563 


1319


2694 


LARPAQPVLLREPEGAGPPVPAGHLVHHLQGGHLRERAHPDLEA 
HEHPLPCDQMFWRQMGGHIiRMVBANSRGVWGIGYDHTAWVYTG 
GYGGGCFGGLASSTSNIYTQSDVKCVHIYENQRWNPVTGYTSRG 
IiPTDRYMWSDASGLQECTKAGTKPPSLQWAWVSDWPVDFSVPGG 
TDQEGWQYASDFPASYHGSKTMKDFVRRRCMARKCKLVTSGPWL 
EVP P IALRDVS 1 1 PES PGAEGSGHS IALWAVS DKGD VLCRLGVS 
ELNPAGSSWLHVGTDQPFASISIGACYQVWAVARDGSAFYRGSV 
YPS QPAGD CW YH I PS P PRQRLKQVSAGQTS VYALDBNGULW YRQ 
GITPSYPQGSSWEHVSNNVCRVSVGPLDQVWVIANKVQGSHSLS 
RGTVCHRTGVQPHEPKGHGWD YGI GGG WDH I SVRANATRAPRSS 
SQEQEPSAPPEAHGPVCC 


6564 


1 


975 


A^USCALWSYCGRGWSRAMRGCQIjI»GLRSSWPGDLiLSARJLLSQE 
KRAAETHFGFETVSEEEKGGKOTQVFESVAK2CYDVMNDMMSLGI 
HRWKDLIiWKMHPLPGTQLIiDVAGGTGDIAFRFLNYVOSQHQR 
KQKRQLRAQQNLSWEEIAKEYQNEEDSIK3GSRVWCDINKEMriK 
VGKQ KALAQGYRAGIAW VLGDAEELPFDDDKFD I YTIAFG I RNV 
THIDQALQBJUffiVLKPGGRFLCliEFSQVNKPLISRLYDLYSFQV 
IP VLGEVIAGDWKS YQYLVES I RRFPSQEEFKDM I EDAGFHKVT 
YESLTSGIVAIHSGFKL 


6565


1464 


999 


RSAVANGLTKRRMGLKLNGRYISL IIjAVQIAYLVQAWAAGKCD 
AVFKG FSDCLLKLGDS MANYPQGLDDKTNIKTVCT YWEDFHS CT 
VTALTDCQEGAKDMWDKLRKES KNLMIQGSLFELCGSGNGAAGS 
LLPAFPVLLVSLSAALATWLS F 


6566 


3 


1385 


kyesaqpggtqpepglgarmaihkalvt«iclglplflfpgawaOg 
hvp pgcsqglnplyynlcdrsgawg i vleavagagi vttfvlti 
ilvaslpfvqdtkkrsllgtqvffllgtlglfclvfacvekpdf 
stcas r r flpgvlfa i cfs claahvfalnflarknhg prg wvt f 
tvallltlveviintewliitlvrgsgeggpqgnssagwavasp 
caianmdfvmal i yvmlliilgaflga wpal cgr y2cr wrkhgvfv 
llttatsvai www i vmytygnkqhnsptwddptiai alaanaw 
afvlfyvipevsqvtksspeqsyqgdmyptrgvgyetigkeqkg 
qsmfvenkafsmdepvaakrpvspysgyngqlltsvyqptbmal 
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SBQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








MHKVPSEGAYDI I LPRATANS QVMGS ANSTLRAEDM YSAQSHQA 
ATPPKDGKNSQVFRNPYVWD 


6567 


125 


863 


TKRSNLKAYACS IHH IRTMS YVFVMDSSQTNVPLLQACI DGDFN 
YS KRLLESG FDPNI RDS RGRTGLKLAAARGNVDI CQLLH KFGAD 
LLATDYQGNTALHLCGHVDTIQFLVSNGLKID I CNHQGATPLVL 
AKRRG VNKDVIRLLESLEEQEVKG FNRGTHSKLBTMQTAE SBSA 
MBSHS LLNPNLQQGEGVLSS FRTTWQEFVEDLGFWRVLLLIFVI 
ALLSLGIAYYVSGVLPFVENQPELVH 


6568


3 


1183 


HASDRLLVLPDNYSHFSQASANLQGPSRTTfiLFHPTlAd *SS Pt<! 
LEGAE LYFN VDHGYIiEGLVRGCKASLLTQQDYINLVQCE TLEDL 
KIHLQTTDYGNF1ANHTNPL7VS K IDTEMRKRLCGEFBYFRNHS 
LEPLS TFLTYMTCS YMI DNVILLMNGALQKKS VKEILGKCHPLG 
RFTEMEAVN IAETPSDLFNAILIBTPLAPFFQDCMSEKALDEtiN 
IELLRNKLYKS YLBAFYKFCKNHGDVTABVMCP ILEFEADRRAF 
IITLNSFGTELSKSDRETIjYPTFGKLYPBGLRLIjAOAEDFDQMK 
NVADH YGVYKPLFEAVGGS GGKTLEDVFYEREVQMMVLAFNRQF 
HYGVFYAYVKLKEQEIRNIVWIAECISQRHRTKIN3YIP2L 


6569


205 


1532 


RRRGPQRLGHGRPTPLLCRWRTAGPSHWEKQARAFQGLRPVDPR 1 
RMSWLFPLTKSASSSAAGSPGGLTSLQQQKQRLIESLRNSHSS1 
AE1QKDVEYRLPFTINNLTIMINILLPPQFPQEKPVISVYPPIR 
HHLMDKQ3VYVTSPLVNNFTMHSDLGKI 1 QSLLDEFWKNP P VLA 
PTSTAFP YL YSNPSGM3 P YAS QGFP FL PP YPP QEANRS ITS LS V 
T^DTVSSSTTSHTTAKPAAPSFGVLSNLPLPIPTVDASIPTSQNG 
FG YKMPD VPDAFPELSELS VS QLTDMNEQ EE VLLEQ FLTLPQLK 
QI ITDKDDLVKS IEELARKNLLLEPSLEAKRQTVLDKYELLTQM 
KSTFEKKMQRQHELSESCSASALQARLKVAAHEAEEESDNIAED 
FLEGKME IDDFLSSFMEKRTI CHCRRAKEEKLQQAIAMHSQFHA 
PL 


6570 


330 


1304 


ARLPRLT FLREG FLYVLLSHWVFVGAPR PPASDS W K KGL VPS AP 
PASRKMG S KALPAPI PLHPSLQLTNYS FLQAVNTFPATVDHLQG 
LYGLSAVQTMHMNHWTLGYPNVHEITRSTITEMAAAQGr..VDAR? 
P FPALP FTTHLFHPKQGAIAHVIiPALH KDRPRFDFANLAVAATQ 
BDPPKMGDLSKLSPGLGSPISGLSKLTPDRKPSRGRLPSKTKKR 
FI CKFCGRHFTKSYNLLIHERTHTDERPYTCDICHKAFRRQDHL 
RDHRY IHSKEKPFKCQECGKGFCQSRTLAVHKTLHMQTSS PTAA 
SSAAKCSGETVICGGT 


6571 


169 


655


APDMNRKKLQKLTDTLTKNCKHL FRG FDKDNDG CVNVLEWI KGL 
S L FLRGSLEEKMKYC FE VFDLNGDGF I SKEEMFHMLKNSLLKQ P 
S EEDPDEG I KDLVE I TLKKMDHDHDG KLS FADYBLAVREETLLL 
E AFGP CLP DPKSQME FEAQVFKD PNE FNDM 


6572 




1645 


TPERAQPGALLGAAGCCVCGGRWWPRSHERGYFSSAKMGSKRRN 
LSCS ERHQ KLVDENYCKKLHVQALKNVNS Q I RNQMVQNSNDNRV 
QRKQFLRLLQNEQFEIiDMEEAIQKAEENKRLKELQLKQEEKLAM 
BIiAKLKHESLKDBKMRQQVRENS IELRELEKKLKAAYMNKERAA 
V^-'^iVi-'A-i- 1 iiUMIiuvUAr»X/iitl MMiii!*HJlKIIKEEltAAEDKRNKA 
KAQYYLDLEKQLBEQEKKKQBAYEQLLKEKLMIDEIVRKIYEED 
QI,BKCX3KLEKMNAMRRYIEBFQKEQALWRKKKREEMEEENRKri 
EFANMQQQREEDRMAKVQENE EKRLQLQNALTQKLBEMLRQRED 
LEQVRQELYQEEQAEIYKSKLKEEAEKKLRXQKEMKQDFEEQMA 
LKELVLQAAKEBEENFRKTMLAKFABDDRI E LMNAQ KQRMKQLE 
HRRAVE KL IEBRRQQ FLADKQRELEBWQIiQQRRCXSFINAI IEEE 
RLKLLKEHATNLLGYLPKGVFKKEDDIDLLGEEFRKVYQQRSEI 
CEEK 


6573 


767 


275 


gggggesq^^c^gtrtpat!x:LMyLqgprk1^c^ydmvqk 

LFLDFFRRRLSQRPTAEELEQRNI LKPRNEQEEQBEKRE I KRRL 
TRKLSQRPTVEELRBRKILIRFSDYVEVADAQDYDRRADKPWTR 
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1 SEQ 
ID 
NO: 


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6574 


204


1159 


IiTAADKVSRQECWRVGGRTVCWVSLGSPLGSV 

LBSSVPVSVGVFWACGVSWTGAAGI^DGALSbTMAmEKAMT^ 

IJ^FKOAQI^KGKVKERRPFLASECTBLPKAEKWRRQIIGBISiC 

KVAQIQNAGLGEPRIRDLNDEINKLLREKGHWEVRIKBLGGPDY 

GKVGPKMLDHEGKEVPGNRGYKYFGAAKDLPGVRELPEKEPLPP 

PRKTRAELMKAJDFEYYGYLDEDDGVIVPLEQEYEKKLRABLVE 

KWKABREARIiARGEKBEEEEBEEEIN I YAVTEEESDEEGSQEKG 

GDDSQQKFIAHVPVPSQQEIBEAIiVRRKKMELLQKYASETLQAO 
SEEARRLLGY 


6575
6576


117 


820 


" S PAIiASQS GG 1 TEB KMLEPQENG VI DLPDV EHVEDET FP P FPPP "~ 
ASPERQDGEGTEPDEESGNGAPVPVPPKRTVKRNIPKLDAQRLI 
SERGLPALRHVFDKAKFKGKGHSAEDLKMLIRHMBHWAHRLFPK 
LQFED F I DRVE YLGS KKE VQTCLKR IRLDLPI LHEDFVSNNDE V 
AENNEHDVTSTELD P FLTNLSE S EM FASELS I S LTEEQQQRI ER 
NKQIALBRRQAKLP 


6577


1


1060 


PBPQALVGQKRGALRLLVARLVtT\/sAPAEVRRRVLRPVLSWMD 
RETRAIlADSHFRGLGVDVPGVGQAPGR VAFVS E PGAFS YADFVR 
GFLLPNLPCVFSSAFTQGWGSRRRWVTPAGRPDFDHLLRTYQDV 
WPVANCGVQEYNSNPKEHMTLRDYITYWKEYIQAGYSSPRGCL 
YL ^WHLCRDF PVEDVFTLP VYFSSDWLNEF WDALDVDDYRFVY 
AG PAG 5 WS P FHADI FRS FSWSVNVCGRKKWLL FPPGOEEALRDR 
HGNLPYDVTSPALCDTHLHPRNQLAGPPLEITQEAGEMVFVPSG 
KHHQ VHNL VMCC FSC PLSGAFLQEDGSTTS PLSQ PELGWNG VAH 
G 




2271 


987


HDRflASDD&'iJivils^WLEAPYKlCEEDECXJRKEVKKDYPSNTTSS 
TSNSGNETSGSSTIGBTSNRSRBRDRYRRRNSRSRSPGRQCRHR 
SRS WDRRHG SESRS R DHRREDR VHYRSPPLATG YR YGHS KS PHF 
RBKS PVRJ3PVDNLSPEERDARXVFCMQLAARIRPRDLEDFFSAV 
GKVRDVRIISDRNSRRSKGIAYVEFCEIQSVPLAIGLTGQRLLG 
VPI I VQAS OAEKl^RIiAAMANI^QKGNGGPKRL YVGSLHFNI TED 
MLRGIFEPFGKIDNXVLMKDSDTGRSKGYGFITFSDSECARRAI, 
EQLNGFEIJIGRPMRVGHVTERIiDGGTDITFPDGDQELDLGSAGG 
RFQLMAKLAEGAGIQLPSTAAAAAAAAAAQAAALQLNGAVPLGA 
LN PAALTALS PALNLASQCLQLSSLFTPQTM 


6578


377 


1489 


PSSSATMNRAPblO^TILHMALTGASDPSAEi^ANGEKPFLLRA 
LQIALWSLYWVTSISMVFLNK^IJ^SPSLRLDTPIFVTFYQCL 
VTTLLCKGLSALAACCPGAVDFPSLRLDLRVARSVLPLSWFIG 
MITFNNIiCLKYVGVAF YNVGRS LTTVFNVLLSYLLLKCJTTS FYA 
LLTCG 1 1 IGGFWLCSVDQEGABGTIjS WLGTVEGVLASLCVSLNAI 
YTTKVLPAVDGSI^LTFYNNVNACILFLPLLLLLGBLQALRDF 
AQLGSAHPWGMMTLGGLFGFAIGYVTGLQIKFTS PLTHNVSGTA 
KACAQTVLAVLYYEETKS FLWWTSNMMVLGGS SAYfWVRGWEMK 
KTPEEPS PKDSEKSAMGV 


6579


2 


711 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYFTSSSVNSSAYT 
IYMGKl>KYENEDLIIO^WPEDIWFHVDKI^SAHVYLRLHKGENl 
ED IPKEVLMDCAHLVKANS TOGCKMNNVMWVT DM c wt irvr* hm 

DVGQ I G FHRQKD VK I VTVEKKVNE ILNR LEKTKVE R FPDLAAE K 

ECRDREERNEKKAQIQBMKKREKEEMKKKREMDELRSYSSLMKV 
ENMSSNQDGNDSDBFM 


6580


62 


1571 


LVALKNWKPKGTNI PAPQSPVFGEAVSG VYMMTKVLGMAPVLGP 
RPPQEQVGPLMVKVEEKEEKGKYLPSLEMFRQRFRQFGYHDTPG 
PREALSQLRVLCCEWLRPEIHTKEQILELLVLEQFLTILPQELQ 
AWVQEHCPESAEEAVTLLEDLERELDEPGHQVSTPPNEQKPVWE 
KISSSGTAKESPSSMQPQPLETSHKYESWGPLYTQESGEEQEFA 
aDPRKVRDCRLSTQHEESADEQMSSEAEGLKGDI IS VI IANKPE 
\3 LERQCVNLENEKGTKPPLQEA3SKKGRES VPTKPTP GERRY I 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








CAECGKAPSNSSNLTKHRRTHTGEKPYVCTKCGKAFSHSSNLTl* 
H YRTHLVDRP YD CKCG KA FGQS SDLLKHQRMHTEEAP YQCKDCG 
KAFS GKGSLIRHYRIHTGEKP YQCNB CGKS FSQHAGLS 5HQRI*H 
TGEKPYKCKECGKAFNHSSNFNKHHRIHTGEKPYWCHHCGKTFC 
SKSNLSXHQRVHTGEGEAP 






476 


RVFL KDLS S TPMASNNTAS I AQAR KL VE QLKMEAN I DR I KVS KA 
AADLMAYCEAHAKBDPLbTPVPASENPFREKKFFCAIL 


6582 


1428 


718


CFTTKTHCSPVSVPYLSPLVljRKErjESLLENEGDQVIHTSSFIN 
QHPI I FWTLVWYFRRLDLPSNLPGLILTS EHCNEGVQLPLSSLS 
QDS KLVYI QLLWDNINJUHQE PREPLYVS WRNFNSE KKSS LLS EE 
QQETSTLVETIRQS IQHNNVLKP INLLSQQMKPGMKRQRSLYRE 
ILPLSLVSLGRENIDIEAFDNEYGIAYNSLSSEILBRLQKIDAP 
PSASVEWCRKCFGAPLI 


6583 


487 


41 


RI PSMTS GRliRWR CT WRPATALWSAS LRLGTS SMH P S PRS ISL P 
LSMMLSPLPSNTRGLSPTALFRS PDSEKATS CPPXHLWRCRAPL 
RSPSPK3RLQVLPRSPLHVHTHNSGKEVLGLQVQRSRSGTGPAC 
SQAGSGAVQGGNWCIF 


6584


189 


1750 


PLPMAALG PS S Q^XVTE Y VVRVPKNTTKKYN 1 MAFNAADKVNFAT 
WNQARLERDLSNKKI YQEEEMPE SGAGS EFNRKLREEARRKKYG 
iVLiOBFRPEDQPWLLRVNGKSGRKFKGIKKGGVTENTSYYIFTQ 
CPDGAFEAFPVHNWYNFTPLARHRTLTAEBAEEBWERRNKVLNH 
FSIMQQRRLKDQDQDBDEEEKBKRGRRKASEIiRIHDLEDDIiEMS 
SDASDASGEEGGRVPKAKKKAPLAKGGRKKKKKKGSDDBAFEDS 
DDGDFSGQEVDYMSDGSSSSQEEPESKAKAPQQ2E3PKGVDEQS 

ESDIDSEASSAFFMAKKKTPPKRERKPSGGSSRGNSRPGTPSAE 
GGSTSSTLRAAASKLEQGKRVSEMPAAKRLRLDTGPQSLSGKST 
PQPPSGKTTPNSGDVQVTEDAVRRYLTRKPMTTKDLLKKFQTKK 
TGLSS EQTVNVLAQILKRLNPERKM INDKKHFSLKE 


6585


3 


1678 


gpirnsriddfvggdpi^asc^vlhskphamadsrdpasdqmqH 
hwkeqraaqkadvi .ttgagnpvgdklnvttvgprg pllvqdwf 
tdbmahfdrbri pervvhakgagafgyfevthditkyskakvfe 

HIGKKTPI AVRFS TVAGESGS ADTVRD PRGFAVKF YTEDGNWD h 
VGNNTPIFFIRDPII^PSFIHSQKRNPQTHIiKDPDMVWDFWSLR 
PESLHQVSFLFSDRGIPDGHRHMNGYGSHTFKLVNANGEAVYCK 
FHYKTDOGIKNLSVEDAARLSQEDPDYGIRDLFNAIATGKYPSW 
TFYIQVMTFKQAETFPFNPFDLTKVWPHKDYPLIPVGKLVLNRN 
PVNYFAEVEQIAFDPSNMPPGIliASPDKMLQGRLFAYPDTHRHR 
LGPNYliH IPVNCPYRARVANYQRDGPMCMQDNQGGAPNY YPNSF 
GAPEQQPSALBHS IQYSGEVRRFNTANDDNVTQVRAFYVNVLNE 
EQRKRliCENIAGHLKDAQ I F I QKKAVKNFTEVH PD YGSHI QALL 
DKYNAEKPKNAIHTFVQSGSHLAARBKANL '■ 


6586 


32 


804 


P I*PEQ PASS TSTMPVSGTPAPNKKR KSS KIiIMELTGGGQESSGL 
NLGKKISVPRDVMLEELSLIiTNRGSKMFKLRQMRVEKFIYENHP 
DVFSDSSMDHFQKPL PTVGGQLGTAGQGFS YS KSNGRGGSQAGG 
SGSAGQYGSDQQHHLGSGSGAGGTGGPAGQAGRGGAAGTAGVGE 
TGSGDQAGG EGKH I TVFKTY I S PWERAMGVDPQ QKMBLG IDLLA 
YGAKAELPXYKSFNRTAMPYGGYBKASKRMTFQMPKV | 


6587 


75 


1117 


RRVPSLGKM PE C WD GEHDI ET P YGLLHWIRGS PKGNR PAI LTY^ 
HDVG&NHKLCFNTF FNFEDMQE ITKHF WCHVDAPGQQVGASQF 
PQGYQFPSMEQIAAMIiPSVVQHFGFKYVIGI GVGAGAYVLAKFA 
LIFPDLVEGLVLVNIDPNGKGWIDMAATKLSGLTSTLPDTVLSH 
LFSQE ELVNNTELVQS YRQQ IGNWNQANLQLFWNMYNSRRDLD 
INRPGTVPNAKTLRC PVMLWGDNAPAEDGWECNSKLDPTTTT 
PLKMADSGGIiPQVTQPGKLTEAFKYFLQGMGYMPSASMTRLARS 
RTASLTSASS VDGSR?QACTHSESSBGLGQVNHTMEVS C 


6588 


137


501


LGLQAQLLiSijRTNNYQIjSDELRKNGrVELTSLRQKVAyLDKEFSK 
AQKALSKSKKAQHVEVLLSENEMLQAKLHSQEEDFRLQNSTLMA 
&c &aj^sqmeqijBQENQQLKEGAAGA0VAQAGP 


6589 


2 


1405 


RPWGSAMAT^SRQEFFQQLljQGCLLPTAQQGLDQIWLCCSiCIA 
CRLLWR LGL P S YLKHAS TVAGG ?FS L YHFFQLHMVW WLLS LLC 
YLVL FLCRHS S HRG VFLS VTI L I YLLMGEMHMVDTVTWHKMRGA 
QMIVAMKAVSLGFDLDRGEVGTVPSPVEFMGYLYFVGTIVFGPW 
IS FH3 YLQAVQGRPLS CRWLQ KVARS LALALLCLVLSTCVG PYL 
FPYFIPI^GDRU^RNKKRKARGTMVRWLRAYESAVSFHKSNYFV 
GFLSEATATLAGAGFTEEKDHLEWDLTV55KPLNVELPRSMVEW 
TS WNL PMS Y W LNNYVFKNAL RIX3TFSAVLVT YAASALLHGFS FH 
LAAVLLSLAF ITYVEHVLRKRLAR ILSAC VLSKRCPPDCSHQHR 
LGLG VRALNL L FGALA I FHLAYLGS L FDVDVDDTTEEQG Y3MA Y 
TVHKWSELSWASHWVTFGCWIFYRLIG 


6590 


2177 


656 


VRAYEHVI»SLLENVFTPMFCHRDEYFRQIiLRGAESPTRNSICLNR 
GSLSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMLPNY 
GVAEGEDDFIEEGIWMEDDSPVEAVSTPNTPRNLAAWKISIPY 
VDFFEDPSSZRKBKKER I P VFCI D VERNDRRAVGHE PEHWS V YR 
RYLEFYVLBSKLTEFHGAFPDAQLPSKRIIGPXNYEFLKSKREB 
FQEYIjQKLLQHPELSNSQIjLADFLS pnggetqfldki LPDVNLG 
KI IKS VPG XLMKEKGQHLEPF I MNFINS CES PKPKPS RPE LTIL 
S PTSENNKKLFNDLFKNNANRAENTERKQNQNYFME VMTVEGVY 
DYLMYVGRWFQVPDWLHHLLMGTRIIiFKNTLEMYTDYYLQCKL 
EQLFQ EHRLVSLXTLLRDAI FCEWTE PRSLQDKQKGAKQT FEEM 
MNYIPDLLVKCIGEETKYESIRLLFDGLQQPVLNKQLTYVLLDI 
VI QE L FPELNKVQKE VTS VTS WM 


6591 


2177 


cT2 

obn 


VRAYE>-iVI^I^ENVF^PMFCHRDEYFRQI«LRGAESPTR^SXLNR~ 
GSLS LDDFRNTQKRGBS FGI S R I GS XIKGVFKS TTMEG AML PNY 
G VABGE DDF I BEG I WME DOS P VBAVSTPNTPRNLAAWK I S I P Y 
VDFFEDPSSBRKEKKBRI PVFCID VERNDRRAVGHEP BHWS VYR 
RYLEFYVLESKLTEFHGAFPDAQLPSKRIIGPKNYEFLKSKREE 
FQEYLQKLLQHPBLSNSQLLADFLSPNGGETQFLDKILPDVNCiG 
KIIKSVPGKLMKBKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTSENNKKLFNDLFKNNANRABNTERKQNQNYFMEVMTVEGVY 
DYLMYVGRWFQVPDWLHHLLMGTRILFKNTLEMYTDYYLQCKL 
EQLFQEHRLVS LI TLLRDAIPCENTEP RS LQDKQKGAKQT FEEM 
MNYIPDLLVKCIGEETKYESIRLLFDGLQQPVLKKQLTYVLLDI 
VIQELFPELNKVQKEVTSVTSWM 


6592 


3 


1861 


APEFLGSTISSGSMIDANLKLIiQEAEORLKAIVAEKFAIATKEG 
DLPQVER FFK I FPLLGLHEEGIiRKFS E YLCKQVAS KAEBNLLMV 
LGTDMSDRRAAVI FADTLTLLFEG IARIVETHQP 1 VETYYGPGR 
LYTLIKYIjfiUPi^nwm/irvTTTrrM/wT v ADnvrt A^nn - _ 
w * i " AA41 ^ v «^^*Wv&&VViJJ^lKQRDYHQ 

TTEKI EPRELDPI LTEVTLMNARS EL YLRFLKKR ISS DPE VGDS 
MASEEVKQEHQKCLDKLLNNC^jLSGTMQELIGLYVTMEBYFMRE 
TWKAVALDTYEKGQLTSSMVDDVFYIVKKCIGRALSSSSIDCL 

caminlattelesdfrdvlcnklrmgfpattfqdiqrgvtsavn 

IMHSSU)QGKFI)TKGlESTDEAKMSFLvTtJ9NVEVCSBNISTLK 

ctlesdctkltsqgiggeqagakfdsclsdlaavsmkfrdllqe 

GLTEIiNSTAIKPQVQPWINSFFSVSHN'IEEEEFNDYEANDPWVQ 
QFILNLEQQMABFKASIiS PVI YDSLTGIiMTSLVAVELEKWLKS 
TFNRLGGLQ FDKELRSL I AYLTTVTTWTIRDKFARLS qmatiln 

lervteildywgpnsgpltwrltpaevrovlalridfrsedikr 
lrl 


6593 


3 


1837 


bafsagsrrrglalqrgvlgglggycpcccrrrgrllvllllvr " 

rggeggggrgrgdkrpj^qarrqrrrpepaearggkmadvlsvl 

rqyniqkkeivvkgdevifgefswpknvkt>tyvvwgtgkegqpr 








E Y YTLDS I L FLLNNVHLSH PVY VRRAATENI P WRRPDRKDLLG 
YLNGE AS TS AS I DRSAPLE IGL QRSTQVKRAADEVLAEAKKPR I 
EDEBCVRLDKERLAARLEGHKBGIVQTEQIRSLSEAMSVEKIAA 
I KAK IMAKKRS T I KTDLDDDI TALKQRS FVDAEVD VTRDI VSRE 
RVWRTRTT I LQS TGKN FS KNI FAI LQS VKAREEGRAPEQR PAPN 
AAPVDPTLRTKQPIPAAYNRYDQERFKGKEETEGFKIDTMGTYH 
GMTLKS VTEGASARKTQTPAAQ PVPRP VSQARPP PNQXKGSRTP 
III IPAATTSLITMLNAKDLLQDLKFVPSDEKKKQGCQRENETL 
I QRRKDQMQPGGTAI S VTVPYRWDQPLKLMPQDWDRVVAVFVQ 
G PAWQFKGWPWLLPDGS PVDI FAKIKAFHLKYDEVRLDPNVQKW 
DVTVLELSYHKRHLDRPVFLRVWETLDRYMVKHKSHLRF 


6594 


1 


1096 


EFPGRKFRGSQASPLCATCGPALLRAPTRAAMTRSLFKGNFWSA 
D I LSTIGYDNI IQHLNNGRKNCKE FEDFLKERAAIEERYGKDLL 
NLSRKKP CGQSEINTLKRALEVFKQQVDNVAQCHIQLAQS LREE 
ARKMEEFREKQKLQRKKTELIMDAIHKQKSLQFKKTMDAKKNYE 
QKCRDKDEAEQAVSRSANLVNPKQQEKLFVXLATSKTAVEDSDK 
AYt^IGTLDKVREBWQSEHIKACEAFEAQECERINFFRNALWL 
HVNQLSQQCVTSDEMYEQVRKSLEMCSIQRDI EYFVNQRKTGQ I 
P PAP I MYEN FYSSQKNAVPAGKATG PNLARRGPLP I PKSSPDDP 
; NYSLVDDYSLI.YQ 


6595


57 


781 


PLGTMSDSDLGEDEGLLSLAGKRKRRGNLPKESVKILRDWLYLH 
RYNAYPSEQEKLSLSGO/TNLSVLQICNWFINARRRLLPDMLRKD 
GKDPNQFTISRRGGKASDVALPRGSSPSVLAVSVPAPTNVLSLS 
VCSMPLHSGQGEKPAAPFPRGELESPKPLVTPGSTLTLLTRAEA 
GSPTGGLFNTPPPTPPEQDKEDFSSFQLLVEVALQRAAEMELQK 
QQDPS LPLLHTPIPLVS ENPQ 


6596 


2 


1026 


PRLPVRRYHGRRRLQGRSRGHMAEGDAGSDQRQNEEIEAMAAIY 
GEEWCVIDDCAKIFCIRISDDIDDPKWTLCLQVMLPNEYPGTAP 
PI YQLNAPWLKGQERADLSNSLEEI YIQNIGES ILYLWVEKIRD 
VLIQKSQMTEPGPDVKKKTEEEDVECEDDLILACQPESSVKAIjD 
FDI SETRTEVEVEELPP IDHGIP ITDRRSTFQAHLAPVVCP KQV 
KM VLS KLYENKKIAS ATHN I YAYR I YCEDKQT FLQDCEDDGETA 
AGGRLLHLMEILNVKNVMVWSRWYGGII^PDRFKHINNCARN 
I LVEKNYTNSP EES SKALGKNKKVR KDKKRNEH 


6597 


2 


1026 


PSLPVRRYHGRRRLQGRSRGHMAEGDAGSDQRQNEEIEAMAAI Y 
GEEWCVIDDCAKIFCIRISDDXDDPKWTLCLQVMLPNEYPGTAP 
P I YQLNAPWLKGQERADLSNS LEE I Y IQNIGES I L YLV7VEK IRD 
VLlQKSQMTEPGPDVKKKTBEEDVBCEDDLrLACQPESSVKALD 
FDI SETRTEVEVEELPP IDHGI PITDRRSTFQAHLAP WCPKQV 
KMVLS KLYENKKI ASATHNI YAYRI YCEDKQTFLQDCEDDGETA 
AGGRLLHLME I LNVKNVMVWS RW YGGI LL3PDRFXHINNCARN 
ILVEKNYTNSPEESSKALGKNKKVRKDKKRNEH 


6598 


1099 


419


PRVRWATTMAMSFEWPWQYRFPPFFTLQPNVDTRQKQLAAWC3L 
VLSFCRLHKQSSMTVMEAQESPLPNNVKLQRKLPVESIQIVLEE 
LRKKGNLEWLDKSKSSFLIMWRRPEEWGKLIYQWVSRSGQNNSV 
FTL YB LTNGEDTEDEEFHGL DEATLLRALQALQQEH KAE I ITVS 
DGPRRQVLLAGTCLPLLLTS HLS RAFKRRQTQC P PKTGSVTP PD 
SKGLQS 


6599 


164 


1593 


KMAALTTLFKYI DENQDR YI KKLAKWVAIQS VSAWPBKRGE I RR 
I4MBVAAADVKQLGGSVELVDIGKQKLPDGSEIPLPPILLGRLGS 
DPQKKTVC I YGHLDVQPAALEDG WDS BP FTLVERDG KLHGRGS7 
DDKGPVAGW INALEAYQKTGQE I PVNVRFCLEGMEESGSEGLDE 
LIFARKDTFFKDVDYVCISDNYWLGKKKPCITYGLRGICYFFIE 
VE CSNKD LHSGVYGGS VHEAMTDLI LLMGS L VDKRGN IL I P GIN 
E AVAAVTE EEHKL YDD IDFD IEEFAKDVGAQ I LLHS HKKD ILNH 
RWRYPSLSLHGIEGAFSGSGAKTVIPRKWGKFSIRLVPNMTPE 








WGEQVT3 YLTKKFAELRS PNE FKVYMGHGGKP W VSDFS H PH YL 
AGRRAMKTVFGVEPDLTREGGS I P7TLTFQEATGKNVMLL PVGS 
ADDGAHSQNEKIjNRYNY IEGTKMLAAYLYBVSQLKD 


6600 


2 


934 


pgrlfrvaamesagleqllrelllpdterirrateqlqIUlrap 
aalsalcdllasaadpq irqfaavltrrrlntrwrr laae qre s 

LKSLILTALQRBTEHCVSLSLAQLSATIPRKEGLEAWPQLLQLL 
QHSTHSPHSPEREMGLLLLSVWTSRPEAPQPHHRELLRLIiNET 
LGEVGS PGLLF YSLRTLTTMA PYLS TEDVPLARMLVPKLIMAMQ 
TL I P I DEAKACE ALB ALDELLES B V? VITP YLSE VLTFCLEVAR 
NVALGNAIR I R I LCCLTFLVKVKS KALLKNRLLATLAAHPFPHC 
GC 


6601 


529 


1420 


P RAAARAPP PAVLRR DRRAATAPGAGEMTLHGPLAQR YFLNHI E 
KI TTWQE PRKAMNQP LNH MNLHP AVSSTPVPQRSMAVS Q PNLVM 
NHQHQQQMAP S TLSQQNHPTQNP P AGLMSKPNALTTQQQQQQKL 
RLQRIQMERERIRMRQEELMRQEAALCRQLPMEAETLAPVQAAV 
NPPTMTPDMRSITNNSSDPFLNGGPYHSREQSTDSGLGLGCYSV 
PTTPBDFXSNVDEPiDTGENAGQ'rPMNINPQQTRFPDPLDCLPGT 
NVDLGTLESEDLI PL FNDVESALNKSEPFLTWI, 


6602 


127 


til 


LLDFPALPKFVLAQSPKAGKPSTMTSMTQSLREVIKAKTKARNF 
ERVLGKITLVSAAPGKVICEMKVEEEHTNAIGTLHGGLTATLVD 
NI STMALLCTERGAPGVS VDMNITYMS PAKLGEDI VITAHVLKQ 
GKTLAFTSVDbTMKATGKLIAOGRHTKHLGN 


6603 


79


660 


PVGPSSLAARTGLGHLPFLHRLASSRGLDMDLLQFLAFI.FVLLL 
SGMGATGTLRTSLDPSLEIYKKMFBVKRREQLLALXNLAQLNDI 
HQQYKILDVMLKGLFKVLEDSRTVLTAADVLPDGPFPQDEKLKD 
AFSHWENTAFFGDWLRFPRIVHYYFDHNSNWNLLIRWGISFC 
NQTGVFNQGPHS P ILSLM 


6604 




688 


TSTAQRQGGERMS FRGGGRGGFNRGGGGGGFNRGGS SNHFRGGG 
GGGGGGNFRGGGRGGFGRGGGRGGFNKGQDQGPPERWLtiGEFL 
HPCEDDI VCKCTTDENKVP YFNAP VYLENKEQ IG KVDE3 FGQLR 
DFYFSVKLSENMKASSFKKLQKFYIDPYKIiLPLQRFLPRPPGEK 

GPPRGGGRGGRGGGRGGGGRGGGRGGGFRGGRGGGGGGFRGGRG 
GGFRGRGH 


6605 


7 


848 


SGSRRGAMRAAGVGLVDCHCHLSAPDF15RDIJDDVLEKAKJKANVV 
ALVAVAEHSGBFEKI MQLS ER YNGFVLPCLGVHP VQGLP PEDQR 
SVTLKDLDVALPIIENYKDRLLAIGEVGLDFSPRFAGTGBQKEE 
QRQVL rRQIOLAKRLNLPVNVHSRSAGRPTIWIiLQEQGAEKVLL 
HAFDGRPSVAMEGVRAGYFFS IPPSI IRSGQQKLVKQLPLTSIC 
LETDS PALGPEKQVRNEP WNI S I SAB YI AQVKG I S VEE VIE VTT 
QNALKLFPKXRHLLQK 


6606 


2


1682 


FVEIR P RAE VANLS AHSAS PI QDAVLKRLS LLSD IV YRQljtf GLS 
KSLGL IEG YGGRG KGGLPATLS PAEE E KAKGPHEKYGYNS YLS E 
KI S LDRS I PDYRPTKCKELKYS KDLPQ IS 1 1 FI FVNEALSVI IiR 
SVHSAVNHTPTHLLKEIILVDDNSDBBELKVPLEEYVHKRYPGIi 

VLSRI QENRKRVILPS I DNI KQDNFEVQRYENSAHGYSWELWCM 
YI SPPKDWWDAGDPS LP IRTPAM IGCS FWNRKFFGE IGLLDPG 
MDVYGGENIELGIKVWLCGGSMEVLPCSRVAHIERKKKPYNSNI 
G F YTKRNALRVAEVWMDDYKS HVYIAWNLPLENPGID IGDVSER 
RALRKS LKCKNFQ WYLDHVYPEMRRYNNT VAYGELRJWTKAKDVC 
LDQGPLENHTAI LYPCHGWGPQLARYTKEGFLHLGAIX3TTTLLP 
DTRCLVDNSKSRLPQOLDCDKVKSSLYKRWNFIQNGAIM^KGTG 
RCLEVENRGLAG IDLI LRS CTGQRWTI KNS IK 


6607 


137 


986


VPACAGLKKE7VRSLLASPPRLLNTKLQASCRALFSPPIQSRQTT 
G I S FQGRGG AGPGVPTRTQVFAAMGAVMGTFS SLQTKQRRPSKD 
KIEDELEMTMVCHRPEGLBQLEAQTNFTKRELQVLYRGFKNECP 








SGVVNEDTt'KQIYAQFFPHGOASTYAHn.FNAFDTTQTGSVKFE - 
DFVTAtiS ILLRGTVHS KLRWTFNL YD INKDG YINQEEMMD IVKA 
IYDMT4GKYTYPVUCEDTPRQHVDVFFQKMDKNKDGrVTLDEFLE 
SCQEDDN I MRS LQLFQNVM 


6608


224 


1140 


RPCFSSPTGLCPRLSYPMIU&HAVLPPPKQPSPSPPMSVATRS 
TGTLQL PPQKP FGQBASLPLAGEE ELS KGGEQD CALEBL CKP LY 
CKLCNVTLNS AQQAQAHYO^KNHGKKLRNYYAANSCP PPARMSN 
WEPAATP WP VPPQMGS FKPGGRVI LATENDYCXLCDASFS S P 
AVAQAHYQGKNHAKRLRLAEAQSNSFSESSELGQRRARKEGNEF 
KMMPNRR^YTVQNNSGPYFNPRSRQRIPRDLAMCVTPSGQFYC 
SMCNVGAGEEME FRQHLES KQHKS KVSEQRYRNEMENLGYV 


6609 


1 


443 


FRLRCRRFRVAGGRiAGAGLRESRVPAPBQRLSALTLLSWSAVT 
PAAEPGNFQLSPAEPRGPLASPVRAAPRAPCPAAEMSELNTKTS 
PATNQAAGQEEKGKAGNVKKAEEBEEIDIDLTAPETEKAALAIQ 
GKFRRFQKRKKDPSS 


6610 


319 


881 


GRKSLCNLH IFI RFPLTYPDMYMGMMCTAKKCG IRFQPPAI ILI 
YESE I KGKI RQR I MPVRN FS KFSD CTRAAEQLKNNPRHKS YLEQ 
VSLRQLEKLFSFIiRGYLSGQSLAETMEQIQRETTIDPEEDLNKL 
DDKELAKRKS I MDELFE KNQKKKDDPNFVYDI E VEFPQDDQLQS 
CGWDTESADBF " \ 


6611 


978


212 


PGCSGAGSRVWWIiPALRHLAMGSTESSSGRRVSFGVDEEERVRV 
LQGVRLSENVVNRMKEPSSPPPAPTSSTFGLQDGNLRAPHKEST 
LPRSGSSGGQQPSGMKEGVKRYEQBHAAIQDKLFQVAKREREAA 
TKHSKASLPTGEGSISHEEQKSVRLARELESREAELRRRDTFYK 
EQLERI ERKNAEMYKLSSEQFHEAAS KMESTIKPRRVEPVCSGL 
QAQILHCYRDRPHEVLLCSDLVKAYQRCVSAAHKG 


6612 


1724 


992 


VSTHASALSRTQGQPQRQPRAAASGAGAGTAGGGGSGGAEGSKM 
STEAQRVDDSPSTSGGSSDGDQRESVQQEPEREQVQPKKKEGKI 
SSKTAAKLSTSAXRIQKELABITLDPPPNCSAGPKGDNIYEWRS 
TILGPPGSVYRGGVFFLDITFSPDYPFKPPKVTFRTRIYHCMIN 
SQGVI CLDILKDNWS PALT1 S KVLLS I CS IXTDCNPADPLVGS I 
ATQYMTNRAEHDRMARQWTKRYAT 


6613 


130 


748 


ELELS SNMPEQSNDYR VAV FGAGG VGKS S L VLR FVKGTFRES Y I ' 
PTVEDTYRQVISCDKSICTLQITDTTGSHQFPAMQRLSISKGHA 
FILVYSITSRQSLEELKPIYEQICEIKGDV3SIPIMLVGNKCDE 
SPSRB VQS 5 EAEAL?U?'XWKCAFMETS AKLNHNVKELFQELLNLE 
KRRTVSLQIDGKKSKQQKRKEKLKGKCVIM 


6614 


3 


1191 


SSAAEAMRVIiVRRCWGPPLAHGARRGRPSPQWRALARLGWBDCR 
DSRVREKPPWRVLFFGTDQFAREALRALHAARENKEEELIDKLE 
WTMPSPSPKGLPVKQYAVQSQLPVYEWPDVGSGEYDVGWASF 
GRLLNEALILKFPYG ILNVHPS CLPRWRGPAPVIHTVLHGDTVT 
G VTIMQ IR PKRFDVGPI LKQETVPVPPKSTAKELB AV1*S RLGAN 
MLISVLKNLPESLSNGRQQPMEGATYAPKISAGTSCIKWEBQTS 
EQ I FRLYRAIGNI I PLQTXWMANTI KLLDLVEWS SVLADPKLT 
GQAliIPGSVIYHKQSQILLVYCKDGWIGVRSVMLKKSLTATDFY 
NG YLHP W YQ KNSQAQPSQCR FQTLRLPTKKKQ KKTVAMQQC I E 


6615 


832 


35 


GRVGAOASAP^SELPGDVRAFLREHPSLRiQTDARKVRCILTGHE " 

LP CRLPELQVYTRGKKYQRIiVRAS PAFDYAE FEPH IVPSTKNPH 

QLFCKLTLRHINKCPEHVLRHTQGRRYQRALCKYBECQKQGVEY 

VPACLVHRRRRRBDQMDGTOPRPRBAFWEPTSSDEGGAASDDSM 

TDLYPPELFTRKDLGSTEDGDGTDDFLTDKEDEKAKPPREKATD 

BGRRETTVYRGLVQKRGFCKQLGSLKKKFECSHHRKPKS FSSCKQS 

G 


6616 


347 


1886 


LLPPCQGARPLSSPPHASEDNLFLFWNCILCAFPHPSPQPLQYP 
VWPLLLVITQI PAPRHLRNRPFS FS RGGLDS FS GSLS TPS I CRS 








PAWVKMAPWPPKGLVPAVLWGLSLFLNLPGPXWLQPSPPPQSSP 
PPQPH P CHTCRGLVDS FNKGLERTIRDNFGGGNTAWEEENLS ICY 
KDSETRLVEVLEGVCSKSDFECHRLLELSEELVESWWFHKQQEA 
PDLFOWLCS D SLKL CC PAfJTPfl PQ nj . Pr ri*rro tn-vr* vr* nr*c*r» 

EGTRGGSGHCDCQAGYGGBACGQCGIiGYPEAERNASHLVCSACP 
GPCARCSGPEESNCLQCKKGWALHHLKCVDIDECGTEGANCGAD 
QFCVNTEGSYECRDCAKACLGCMGAGPGRCKKCSPGYQQVGSKC 

IPESAGFFSEMTEDELWLQQMFFGIIICAIATLAAKGDLVFTA 
IFIGAVAAMTGYWLSERSDRVLBGFIKGR 


6617 


118 


673 


VWMAWQVSliLELEDRLQCPICLEVFKESLMLQCX3HSYCKGCIiVS 
LSYHLDTKVRCPMCWQAVDGSSSLPNVSLAWVIEALRLPGDPBP 
«\. v v nmw r usij r cis. kdqejji CGUCjGI»LGS HQHHP VTPISTVCS 

RMKEELAAIiFSELKQEQKKVDELIAKLVKNRTRlDGSAPSLCPC 
LGPATFTFL 


6618


548


136 


DGKVARRAPNS PAFQNDI YPLVSAPRATTAES PWSKVLQNTQCR 

NVPKMTSERS RI PCL S AAAAEGTGKKQQEGRAMATLDRKVPS PE 

AFIXJKPWSSWIDAAKLHCSDNVDLEBAGKEGGKSREVMRIjNKEA 
WKYGT 


6619 


246 


842


passevltaavmflllncivavsonmgigkngdlprpplrnefr 

YFQRMTTTS SVEGKQN LVI MGRKTWFS IP K KNRPLKDR I N L V JbS 
RELKEPPQGAHFIARSIJDEALKLTERPELaNKVDMIWIVGGSSV 

ykeam»fhlghlklfvtrlmqdfesdtffserdlekyklt,peypg 
ilsdvqegkhikykfevcekdd 


6620 


3 


1879 


NSRVDDFVARARMAAENEASQESALGAYSPVDYMSITSFPRLPE 
DEPAPAAPIiRGRKDEDAFLGDPDTDPDSFLKSARLORLPSSSSE 

mgsqdgsplretrkdpfsaaaaecscrqdgltvivtacltfatg 
v i VALiVMUl ifgdfqifqqgavvtdaarctslgievlskqgssv 
daavaaalci^ivaphssglggggvmlvhdirrneshlidfres 
apgalrbetlqrswetkpgllvgvpgmvkglheahqlygrlpws 
q vijvfaaavaqdgf1xvthdiarai»aeqlp pnms erfre tflpsg 
rpplpgsllhrpdlaevldvlgtsgpaafyaggnltlemvaeaq 

HAGGVITEEDFSNy^XT.VPTTPVrviWPrttrT.trr o DnnnuTvnnT t 

SALNILEGFNLTSLVSREQALHWVAETLKIALALASRLGDPVYD 
STITESMDDMLSKVEAAYLRGHINDSQAAPAPLLPVYEliDGAPT 
AAQ VL IMGPDD F I VAMVSSLNQ P FGSGLI TP SGILLNSQMLDFS 
W PNRTANHS AP SLENS VQPGKRPLS FLLPTWRPAEGLCGTYLA 
LGANGAARGLSGLTQVRFTPWLAFFSREPSCGLDCRCLSYLWLV 
SIPHAANMG 


6621


1 


662 


VQGITSYQQRLQALRKEKSRDAARSRRGKBNFEFYELAKLLPLP " 
AAITSQLDKASIIRLTISYLKMRDFANQGDPPWNLRMEGPPPNT 
SVKVIGAQRRRSPSALAIEVFEAHLGSH1LQSI»DGYVFALNQEG 
KFL YISETVS IYLGLSQVELTGSS VFDYVHPGDHVEMABQLGMK 
LPPGRGLLSQGTAEDGASSASSSSQSETPEPWCFPPASDQFLL 


6622 


2 


319 


GRASGAQBETEAGGPERARAMEANMPKRKEPGRSLRIKV'ISMGN 
AEVGKSCI IKRYCEKRFVSKYLATIGID YOVTKVHVRDREIKVN 
I FDMAGHP FFYE VRKPF 


" 6623 


1886 


189


KAIiFEKVKKFRIiHVEEGDILYAMYVRQTVLKVIKFLI I IAYNSA 
LVSKVQFTVDCNVDIQDMTGYKNFSCI^TMAHLFSKLSFCYLCF 
VS I YGLTCLYTLYWLF YRSLRB YSFE YVRQETGFDDI PDVKNDF 
AFMLHMIDQYDPLYS KRFAVFLSEVS BNKLKQLNLNNEWTPDKI* 
RQKLQTNAHNRLEIiPLim^GLPDTVPEITELQSLKLEIIKNVM 
1 PATI AQLDNLQELS I»HQCS VK IHSAALSFLKENLKVLS VKFDD 
MRELPPWMYGLRNLBEliYLVGSLSHDISRNVTLESLRDLKSLKI 
LSIKSNVSKlPQAVVDVSSHLQKMCIHI^GTKL\mLNNLKJCKrN 
LTE LELVHCDLERI PHAVFS LLSLQS LDLKENffLKS I EE I VS FQ 








HLRKLTVLKLWHNSITYIPEHIKKJbTSLER^ 

LFLCN KIR YLD LS YND I RFI PPE IG VLQSLQYFS ITCNKVE S L P 

DELYFCKiCLKTLKIGKNSLSVLSPKIGNLLPLSYLDGKGNHPEI 

LPPELGDCRALKRAGLWEDALFETLPSOVREQMKTE 


6624 


218 


1786 


GSRRGGGS RI P AVS TH VAPGRSVijRP FASGALRIiRS-LWAT.fiOf 
RGRPSGLAHIiSQETSfflTRAKRSGRACLGDPPGEILRSFIMKCTA 
REWLRVTTVLFMARAIPAMVVPNATLLEKLLEKYMDEDGEWWIA 
KQRG KRAI TDNDMQS I LDLHNXLRS QVYPTASNME YMTWDVELE 
RS AE S WAE S CLWEHGPASLL PS I GQNLGAHWGRYRPPTFHVQS W 
YDEVKDFSYPYEHECNPYCPFRCSGPVCrrHV T rnt7UWa*rcx7T>T^r» 
AI NLCHNMN I WGQ I WP KAVYLVCNY SPKGNWWGHAPYKHGRPCS 
AC P P SFGGG CRENLCYKEGSDR YYPPREEETNB IERQQSQVHDT 
HVRTRSDDSSRNEVISAQQMSQIVSCEVRLRDQCKGTTCNRYEC 
PAGCLDS KAKVIGS VHYEMQS3 1 CRAAIHYGI IDNDGGWVDITR 
QGRKHYFI KSNRNG IQTIGKYQSANSFTVSKVT VQAVTCETTVE 
QLCPFHKPASHCPRVYCPRKLYASKSTLCSCNWNS3LF 


6625 


1124


543


PGPRGGGGSLLSTKALGRSRGLGMHPGPSSGGTEGGVPTALRPP " 
GPLVPSTSfcDNLLKNI BLFDKLALRFKGRLLFLKDVLGDEICCW 
SFYGQGRKI AEVCCTS I VYATBXKQTKVEFPEARIFEETLNILI 
YETPRGFDPALLEATGGAAGAGGAGRGEDEENREHRVRRIHVRR 
HITHDERPHGQQIVFKD 


6626


3 


1498


SAVE FVYTD R PHIjTTdTCTQVFP'T.fgT PCnhTMpg TT » ot — - 
LDVPWPRSKIGSDQDSGIELLNVbHRVILTRESPSIQLASLEW 
RQI I CAAQEHVKEKRRSAEVDDGAAEKETLPEFGEGKDTGGLVP 
GKSLVFATLELCVCI LVRQLPELNPKLTGSPGVKATKPQI LLED 
GSRIiVSAAIiVILSELPAVCSPEGS I S ILPTZL YLTIGVLRETAV 
KLPGGQLSSTVAASLQALKGILSS PMARABKSRTAWTDLLRSAL 
TTI LDCWDPVDETHOELDEVSIjIjTAITVFTT 1 qTqpPVT''rT nrr rv 
KRC I DKFKATLE I KDPWQ I KTYQLLHS I PQYPNPAVS Y P Y I YS 
LASCIMEKLQEIDKRKPBNTAELEIFQEGIKVLETLVTVAEEHH 
RAQLVACLLPILISFLLDENSU3SATSIMRNLHDFALQNLMQIG 
PQYSSVFKSJbVASSPALKARLBAAIKGNQBSVKVKIPTSKYTKS 
PGKNSSIQLKTSFL 




" 6627 


1 


697 


GIPHLSSRDMTGTPGAVATRDGEAPERSPPCSPSYDLTGKVMLL " 
GDTGVGKTCTLI QFKDGAFLSGTFIAWGIDFR23KVVTVDGVRV 
KLQ I WDTAGQ B R FRS VTHAYYRDAQALLLLYD ITNKSS FDM IRA 
WLTEI HE YAQRDWIMLLGNKADMS S ER VIRS EDGBTLARE YGV 

PFLETSAKTGMNVELAFLAIAKELKYRAGHQADEPSFQIRDYVE 
SQKKRSSCCSFM 




6628 


1 


1861 


QCAEFGGGSGGGGGSGGGGSGGGRGAGGEENKENERPSAGSKAN*" 
KEFGDSLSLBILQIIKESQQQHGIiRlIGDFQRYRGYCSRRQRRLR 
KTLNF KMGNRHICFTGKKVTBEIjLTDNR YLLIjVLMDAERAWS YAM 
QLKQEANTE PRKRFHLLS RLRKAVKHAEELERLCESNRVDAKTK 
LEAQAYTAYLSGMLRFEHQEWKAAIEAFNKCKTIYHOASAFTB 
EQAVLYNQRVEEISPNIRYCAYNIGDQSAINELMQMRLRSGGTE 
GLLAEKLEALITQTRAKQAATMSEVEWRGRTVPVKIDKVRIFLL 
GLADNEAAIVQAESEETKERLFESMLSECRDAIQWREELKPDQ 
KQRD YI LEGEPGKVSNLQYLHS YLTYIKLSTAI KRNENMAKGLQ 
RALLQQQPEDDSKRSPRPQDLIRLYDIILQNLVEliLQLPGLBED 
KAFQKE I G LKTL V FKA YRC F F I AQS YVL VKKWS EALVL YDR VLK 
YANE VNSDAGAFKNSLKDLPD VQELI TQVRSEKCSLQAAAI LDA 
NDAHQTETSS SQVKDNKPLVE RFETF CLDPSLVTKQANLVHF PP 
GFQPIPCKPL FFDLALNHVAFPPLED KLEQ KTKSGLTGYI KG IF 
GFRS 


1 


5653 


454* 


GATPLGSVGGRTGKMDAATLTYDTLRFAEFEDFPETSEPVW I LG 
RKYS I FTEKDE ILSDVASRLWFTYRKNFPAIGGTGPTSDTGWGC 








MLRCGQM I FAQALVCRHIXSRDWRWTQRKRQPDSYFS VLNAFIDR 
KDS YYS IHQIAQMGVGEGKS IGQW YGPNTVAQVLKKLAV FDTWS 
S LAVHI AKDWTWMEBlRRIiCRTS VPCAGATA FPADSDRHCNGF 
PAGAEVTNRPS PWRPLVLLI PLRLGLTDINEAYVETLKHCFMMP 
Q S LGVI GGKPNS AHYF IG YVGEEL I YLDPHTTQPAVEPTDGCFI 
PDESFHCQHPPCRMS1AELDP3IAWRGGHLSTQAFGAECCLGM 
TRKTFGPLRPP FSMLG 


6630


2 


423 


LVQCGGIRRRSAWGAMPGRHVSRVRALYKRVLQLHRVLPPDLKS 
LGDQ YVKDE FR RHKTVGSDEAQRPLQ KWK V Y ATALIjQQ ANENRQ 
NSTGKACFGTFLPEEKJLNDFRDEQIGQLQELMQSATKPNROFSI 
SBSMKPKF 


6631 


2 


423 


LVQCGGIRRRSAWGAMPGRHVSRVRALYKRVLQLHRVLPPDLKS 
LGDQYVKDEFRRHKTVGSDEAQRFLQEWBVYATALLGX3ANENRQ 
NSTGKACFGTFLPEEKLNDFRDEQIGQLQEIiMQEATKPNRQFSI 
SESMKPKF 


6632 


1273 


588


WNSRGRTQ RGAAPLAPAAAMKAVVQRVtRAS VTVGGEQ 1 S Ail GR 
GICVLLGISLEDTQKELEHMVRKILNLRVFEDESGKHWSKSVMD 
KQYE I LCVSQFTLQCVLKGNKPDFHLAMPTEQAEGFYNS FLEQL 
RKTYRPELIKDGKFGAYMQVHIQNDGPVTIELESPAPGTATSDP 
KQLS KLEKQQQRKE KTRAKGPSE SS KERNTPRKEDRS ASSGAEG 
JDVSSERBP 


6633 


1145 


617 


ATGRHEGVPTLEG I IQQLVNG I ITPATIPSLGPWGVLHSNPMD Y 
AWGANGLDAIITQLLNQFENTGPPPADKEKIQALPTVPVTBEHV 
GSGLE CP VCKDD YALGER VRQLPCNHLFHDG C IVP W LEQHDSCP 
VCRKSLTGQNTATNPPGLTGVSFSSSSSSSSSSSPSNENATSNS 


6634 


1 


1134 


CGGIPRKGSGPRRRLPKARLRDCLPRLMLTLRSLLPWSLVYCYC 
GLCASIHLLKLLWSLGKGPAQTFRRPAREHPPACLSDPSLGTHC 
YVRIKDSGLRFHYVAAGERGKPlMIiIjLHGFPEFWYSWRYQLREF 
KSEYRWALDLRGYGETDAPIHRQNYKLDCLITDIKDILDSLGY 
S KCVL I GHDWGGM IAWLI AI CYPEMVMKL I VINFPHPNVF7EYI 
TiRHPAQLLKSSYYYFFQIPWFPEFMFSINDFKVLKHliFTSHSTG 
IGRKGCQLTTBDLEAYrYVFSQPGALSGPINHYRNIFSCLPLKH 
HMVTTPTLLLWGENDAFTVIEVEMAEVTRFYVKNYPRIjTILSEASH 
WLQQDQPDIVNKLIWTFLKEETRKKD 


6635


1420 


470 


EMRAGQQLASMLR W TRAWRJjPREXSIjGPHG PSFAR VP VAP S SS 5 G 
GRGGAEPRPLPLS YRLIJX5EAALPAVVFLHGLFGS KTNFNS I AX 
I LAO^TGRRVLTVDARNHGDS PHSPDMS YElPiS QDLQDLL PQLG 
LVPCVWGHSMQGKTAKD^LALQRPELVERLIAVDISPVESTGVS 
HFATYVAAMRAINIADELPRSRARXLADEQI*SSVIQDMAVRQHL 
LTNLVEVDGRFWRVJJtiDALTQHLDKI LAPPQRQES YLGPTLFL 
LGGNSQFVHPSHHPEIMRLFPRAQMQTVPNAGHWIHADRPQDFI 
AAIRGFLV 


6636 


1514 


1801 


S FCM FSHKQDSHFQAVPVQEKKKRLRRAPWRAFAQ P QRLKH PAE 

QPIVRQCLQRPPLCGVLGPVQQQLPPSLGPVLSPHSDPGWCRVD 
DGGDGVF 


6637 


2 


1501 


CSSS PCFHDCTC VLDKAGS YKCACIiAG YTGQRCENLLE AGKS Kl 
KASEDSLSVLEBRNCSDPGGPVNGYQKITGGPGLINGRIIAKIGT 
WS FFCNNS YVLSGNB KRTCQQNGE WSGKQPICI KACREPKIS D 
LVRRRVLPMQVQSRETPLHQLYSAAFSKQKLQSAPTKKPALPFG 
DLPMGYQHLHTQIiQYECISPFYRRLGSSRRTCLRTGKWSGRAPS 
CI P I CGK I ENITAP KTQGLRW PWQAAI YRRTSG VKDGS LHKGAW 
FLVCS GALVNERTVWAAHCVTDLGKVTMI KTADL KWLGKF YR 
DDDRDEKT I QS LQI S AI I LHPNYDP ILLDADI AI L KLLDKAR I S 
TRVQPICLAASRDLSTSFQESHITVAGWNVIiADVRSPGFKNDTIi 
RSGWSWDSLLCEBQHEDHGIPVSVTDNMFCASWEPTAPSDIC 
TAETGG I AAVS FPGRAS PE PRWHLMGL VS WS YDKTCSHRLS TAP 








TKVLPFKDWIERNMK 


6638 


1391 


224 


GGI PQAGGKMAAPWWRAALCECRRWRGFSTS AVLGRRTPPLGPM 
PNS D I DLSNLE RLE KY RS FDR YRRRAEQEAQAPHWWRTYREYFG 
EKTDPKEK1DIGLPPPKVSRTQQLLERKQAIQBLRANVEEERAA 
RLRTASVPLDAVRAEWBRTCX3PYHKQRLAEYYGLYRDLFHGATF 
VPRVPLHVAYAVGE DDLM PVYCGNE VTPTEAAQ APEVT YEAEEG 
SIiWTLLLTSLDGHLLEPDAEYTiHWLLTNIPGMRVAEJGQVTCPYIi 
PPFPARGSGIHRLAFLLFKODQPIDFSEDARPSPCYQLAORTFR 
TPD FYKKHQETMTPAG LS FFQ CRWDDS VTYI FHQLLDMRE PVFE 
FVRPPPYHPKQKRFPHRQPLRYLDRYRDSHEPTYGIY 


6639 


2046 


1268 


1G C F IMDGGDDGNL 1 1 KKRFVSEAELDERRKRRQEEWE KVR KPE 
DPEECPEBVYDPRSLYERLQEQKDRKQQEYEEQFKFKNMVRGLD 
EDETNFLDRVSRQQELIEKQRREBBLKELKEYRNNLKKVGISQB 
NKKE VEKKLTVKP IETKNKFSQAKLIiAGAVKHKSSESGNSVKRL 
KPDPEPDDKNQEPSSCKSLGNTSLSGPSIHCPSAAVCIGILPGL 
GAYSGS S DS ES S SDS EGTI NATGK I VSS I FRTNTFLEA P 


"6*40 


117 


1043 


VLEPPDVSMAESEDRSLRIVLVGKTGSGKSATANTILGEEIFDS 
RIAAQAVTKNCQKASREWQGRDLLWDTPGLFDTKESLDTTCKE 
ISR C 1 1 S S CPGPRAIVLVLLLGRYTEEEQKT VAL I KAVFG KSAM 
KHMVILFTRKEELEGQSFHDFIADADVGLKSIVKECGNRCCAFS 
NSKKTSKAEKESQVQELVELI BKMVQCNEGAYFSDDI YKDTEER 
L KQREEVLRKI YTDQLNE E I KIiVE EDKHKSEE KKE KEI KLLKLK 
YDEKIKMIREEABRNIFKDVFNRIWKMLSEIWHRFLSKCKFYSS 




1 


894 


SMVGRRS^VRGCiX^RPRlto^ARRMb^VPGTDSAPl^G'LAWSS 
AS APPPRGFS AI S CTVBGAPASFGKS FAQKS G YFLCLS SLGSLE 
NP QENWADIQI WDKS PLPLGPS P VCDPMDS KASVS KKKRMCV 
KLI>PLGATDTAVPDVRLSGKTKTVPGYLRIGDMGGFAIWCKKAK 
APRPVPKPRGLSRDMQGLSLDAASQPSKGGLLERTASRXGSRAS 
TLRRNDS I YEAS SL YG I SAMDGVP FTLHPRFEGKSCS PLAFS AF 
GDLTIKSLADIEEEYNYGFVVEKTAAARLPPSVS 


6642 


22 


1296 


PLEERMMTKMDPNDQAQRDI I FELRRIAFDAESDPSNAPGSGTB 

KRKAMYTKDYKMLGFTiraiNPAMDFTQTPPGMLAL^ 

HQDTYIRIVLENSSREDKHECPFGRSAIELTKMLCEILQVGELP 

NEGRNDYHPMFFTHDRAFEEIiFGICIQLLNKTWKEMRATAEDFN 

KVMQVVREQITRALPSKPNSLDQFKSKLRSLSYSEILRLRQSER 

MSQDDFQSPPIVELREKIQPEILELIKQQRLNRLCEGSSFRKIG 

NRRRQERFWYCRLAIiNHKVIiHYGDLDDWPQGEVTFESLQEKIPV 

ADIKAIVTGKDCPHMKBXSALKQNKEVLEIAPS ILYDPDETLNF 

IAPNKYEYCIWIDGLSALLGKDMSSELTKSDLDTLLSMEMKLRL 

LDLENIQIPEAPPPIPKEPSSYDFVYHYG 


6643 


3049 


2265 


SLHAPAEGRTRGRI^KPKMLTRKIKLWDINAHITCRLCSGYLI 
DATTVTECLHTFCRSCLVKYI»BE3^TCPTCRIVIHQSHPLQYIG 
HDRTMQD I VYKLVPGLQ EAEMRKQRE F YHKLGMEVPGD I KGETC 
SAKQHLDSHRNGETKADDSSNKEAAEEKPEEDNDYHRSDEQVSI 
CLECNSS KLRGLKRKW IRCSAQATVLHLKKFI AKKLNLSS FNEL 
DILCNEEIIX3KDHTLKFVVVTRWRFKKAPLLLHYRPKMDLL 


6644


1489 


290 


FR PLATE PRGS S P VQLVSSTMS VRTLP LLFLNLGGEMLY ILDQR 
LRAQN I PGDKARKVLND 1 1 STMFNRKFMEE LFKPQELYS KKALR 
TVYERLAHAS IMKLNQASMDKLYDLMTMAFKYQ VLLCPRPKDVL 
LVTFNHLDTI KGF IRDS PTI LQQVDBTLRQLTE I YGGLSAGEFQ 
LIROTLLIFFQDLHIRVSMFLKDPCVQNNNGRFVLPVSGPVPWGT 
BVPGIiIRMFNNKGEBVKRIEFKHGGNYVPAPKEGSFEFYGDRVL 
KLGTNMYSVNQPVETHVSGS SKNLASWTQES IAPNPLAKEELNF 
LARLMGGMEI KKPSGP BPGFRLNLFTTDEEEEOAALTR PEELS Y 
EVIKI QATQDQQRSEELARI MGEFE ITEQPRLSTS KGDDLLAMM 
DEL 


6645


6"530 



4646 


FVEGI^GYVYKAASEGKVLTLAALLLNRSESDIRYLLGYVSQQG 
GQRSTPLI IAARNGHAKWRLLLEHYRVQTQQTGTVRFDGYVID 
G ATALW CAAG AGHFE VVXLL VS HG ANVNHTT VTNS T PLRAACFD 
GRLDIVKYLVENNANI S IANKYDNTCLMIAAYKGHTDWRYLLE 
QRADPNAKAHCGATALHFAAK AGHI D I VKBLI KWRAAI WNGHG 
MTPLKVAAESCKADWELLLSHADCDRRSRIEALELLGASFAND 
REN YD I I KTYH YL YIiAMLE RFQDGDNILE KE VLP PIHAYGNRTE 
CRNPQELES I RQDRDALHMEGLI VRERILGADNI DVSHPI I YRG 
AVYADNMEFEQCI KLWbHALHLRQKGNRNTHKDLLRFAQVFSQM 
IHLNETVKAPDIECVLRCS VLEI EOSMNRVKNI SDADVHiraMnw 
YECNLYTFLYLVCISTKTQCSEEDQCKINKQIYNLIHLDPRTRE 
GFi'LLHLAVNSNTPVDDFHTNDVCSFPNALVTKLLLDCGAEVNA 
VDNEGNSALHI IVQYNRPISDFLTLHSII ISLVEAGAHTDMTNK 
QNKTPLDKSTTGVSE I LLKTQMKMSLKCLAARAVRANDINYQDQ 
IPRTLEEFVGFH 


6546 


176 


890 


PSSRMNHLPEDMENALTGSDSSHASLiRNI H<> T N pt*ot ,'m& d t rev — 
EGREKKGISDVRRTFCLFVTFDLLFVTLLWI I ELNVNGG IENTL 
EKEVMQYDYYSS YFD1 FLLAVFRFKVLI LA YAV CRLRH W W AIAL 
TTAVTSAFLLAKVILS KLFSQGAFGYVLPI I S PILAW IETWFLD 
FKVLPQEAEEENRLLI VQDASERAALI PGGLSDGQFYS PPESEA 
GSEEAEEKQDSEKPLLEL 


6647 


176 


890 


PSSRMNHLPEDMENALTGSQSSHASLRNIH3INPTQLMARIESY" 
EG RE KKG IS DVRRTFCLFVTFDLLFVTLLWI I ELNVNGG IENTL 
EKE VMQ YDYYSS YFDI FLLAVFRFKVX,ILAYAVCRLRHWwa T at 
TTAVTSAFLLAKVILS KLFSQGAFGYVLPI ISFILAWIETWFLD 

FKVLPQEAEEENRLLI VQDASERAALI PGGLSDGQFYS PPESEA 
GSEEAEEKQDSEKPLLEL 


6648


413 


897 


RNCWN CFTKY FNS PPED I DHKDS YL I TRS I MAEPD Y IEDDN PEL "~ 
IRPQKLlNPVKTSRNHQDLHRELIiMNQKRGLAPQNKPBLQKVME 
KRKRDQVI KQKEEBAQKKKSDLEI ELLKRQQKLEQLELEKQKLQ 
EEQENAPE FVKYKGNLRRTGQEVAQAQES 


6649 


1357 


832 


W I PRAAGI RHE VKWDVKE I MSQHN I YVDALLKEFEQFNRRLNEV 
SKRVRIPLPVSNILWEHCIRIJUTOTIVEGYANVKKCSNEGRALM 
QLDFQQFLMKL EKLTDIRP I PDKE FVETY IKAY YLTENDMER WI 
KEHREYSTKQLTNLVNVCLGSHINKKARQKLLAAIDDIDRPKR 


6650 


32 


765 


LVPL VFS LLVQS CKQVYRS I AMKF VPCLLLVTLS CLGTLGQAPR 
QKQGSTGEEFHFQTGGRDSCTMRPSSLGQGAGEVWLRVDCRNTD 
QT YW CE YRGQPS MCQAFAADP KS YWNQALQELRR LHHACQGAP V 
LRPS VCREAGPQAHMQQVTSS LKGSP E PNQQ PEAGTPSLRPKAT 
VKLTEATQLGKDSMEELGKAKPTTRPTAKPTQPGPRPGGNEEAK 
KKAWEHCWKPFQALCAFLISFFRG 


6651 


3425 


1353 


AKELLKVGDFSLCAGPYQNTADTMENLSKEPLASFVSESFDISA
CGI ATEHVKIDNSGEGLTAEAG SETLS RDGEVGVNS DMHYELSG 
DSDLDLLGDCRNPRIjDLEDS YTLRGS YTRKKDVPTDG YES S LNF 
HNNNQEDWGCSSWVPGMETSLPPGHWTAAVKKEEKCVPPYVQIR 
DLHG ILRTYANFS ITKBLKDTMRTSHGLRRHPSFSANCGLPSSW 
rSTWQVADDLTQNTLDLEYLRFAHKLKQTIKNGDSQHSASSANV 
FPKESPTQISIGAFPSTKISEAPFLHPAPRSRSPLLVTVVESDP 
RPQGQPRRGYTAS SLDSS S S WRERCSHNRDLRNS QRNHTVS FHL 
NKLKYNSTVKESRNDISLILNEYAEFNKVMKNSNQFIFQDKELN 
DVSGEATAQEMYLPFPGRSASYEDII I DVCTNLHVKLRS VVKBA 
CKSTFLFYLVETEDKSFFVRTKNLLRKGGHTE I E PQHFCQAFHR 
ENDTLI 1 1 IRNBDISSHLHQIPSLLKLKHFPS VIFAGVDSPGDV 
LDHTYQEL FRAGGFVI SDDKI LEAVTLVQLKE 1 1 K I LEKLNGNG 
RWKWLLHYRENKKLKEDERVDSTAHKKNIMLKSFQSANIIELLH 
YHQCDS RSSTKAE ILKGLLULQ IQHI DARFAVLLTDKPT IPRE V 


6652 


2 


1343 


FBNNOILVTDVNKFI ENIEKIAAPFRSSYW 

IPGSTISCSCHSRRLRGGSPAPRLSLGAASPRPRPPSLPLPI,'PL~ 
PFPLFLPTOPAERAWIRSRRASEWVGKMBVPRLDHALNSPTSPC 
EEVIKNl^LBAIQLCDRDGNKSQDSGIAEMEEIiPVPHNIKISNI 
TCDS PKIS WEMDS KSKDRITHYFI DLNKKBNKNSNKFKHKDVPT 
KLVAKAVP L P M TVRGH WFLS PR7B YTVAVQ TAS KQVEGD YWS E 
WSE I IEFCTADYSKVHLTQLLEKA2VIAGRMLKFSVFYRNQHKE 
YFD YVREHHGNAMQ P S VKDNSGSHG SPI SG KLE G I FFS CSTEFN 
TGKPPQDS P YGRYRFE I ARE KLFNPNTNLYFGDFYCM YTAYHYV 
ILVIAPVGSPGDEFCKQRLPQLNSKDNKPLTCTEEDGVLVYHHA 
QDVILE7I YTDP VDLS LGTVAE I TGHQLMS LSTANAKKD PSCKT 
CNISVGR 


6653 



170 


1910 


FFLEPRLRPFPASRARFVPARTRPSPLHPCCFCFEGGGSMLSPQ
R VAAAA5 RGADDAME S S KPG PVQVVLVQKDQH S FELDEKALAS I 
u^uvn^nuiji/v v v v^> v f\iy+\jf KAb i\.i> r -li*lJFMl*RYJuYSQKESGHS 
NWLGDPEEPLTGFSWRGGSDPETTGIQIWSEVFTVEKPGGKKVA 
WLMDTQGAFDSQSTVKDCATI FALSTMTSS VQI YNLSQNIQED 
DLQOLQLFTEYGRLAMDEIFQKPFQTLMPLVRDWSFPYEYSYGL 
QGGMAFIDKRLQVKEHQHEEIQNVRNHIHSCFSDVTCFLLPHPG 
LQVATS PDFDGKLKDI AGEFKEQLQAL I PYVLNPSKLMEKEING 
l3AV 1 ^^•uub i r wii x ivjl xyoisi^JjFHPKoriliQATAEAYNliAAAA 
SAKDIYYNNMBEVCGGEKPYLSPDIIiEEKHCEFKQLALDHFKKT 
KKMGGKDFS FRYQQELEEEI KE LYENFCKHNGS KNVFSTFRTPA 
VLFTGIVALYIASGLTGFIGLEWAOT JTNrMvrn tttmt T*wrrv 

IRYSGQYRBLGGAIDFGAAYVLEQASSHIGNSTQATVRDAWGR 
PSMDKKAQ 




1 


705 


RTSLS PSQ CS S FNIUAMASAGMQI LG VVLTLIiGWVNGLVS CALPM 
WKVTAF1GNS I WAQWWEGLWMSCWQSTGQMQCKVYDSLLAL 
PQDLQAARAL CVIALL VAL FGLLVYLAGAKCTTCVE EKDS KARL 
VLTSGI VF V ISGVLTL I PVCWTAHAVIRDFYNPL VAEAQ KRELG 
ASLYLGWAASGLLLLGGGLLCCTCPSGGSQGPSHYMARYSTSAP 
AISRGPSEYPTKNYV 


6655 


341 


16 


KDAYMFKKGLIALALVFSLPVFAAEHWIDVRVPEQYQQEHVQGA
INI PL K3 VKER I ATAVPDKNDTVKVYCMAGRQS GQAKE I LSEMG 
YTHVENAGGLKD I AMPKVKG 


6656 


2 


1212 


TEI.PPRPANLAIQPPLSPLRALAPLPEKPGAVP.PPQKRMAKVAK 
DLNPG VKKMS LGQLQS ARGVACLGCKGTCSG FE PHS WRKI CKS C 
KCSQEDHCLTSDLEDDRKIGRLLMDSKYSTLTARVKGGDGIRIY 
KRNRM I MTNP IATGKDPTFDTITYEWA? PGVTQ KLG LGYME L I P 
KEKQPVTGTEGAPYRRRQLMHQLPIYDQDPSRCRGLLENELKLM 
EEFVKQYKSEALGVGBVALPGQGGI.PKEEGKQQEKPEGAETTAA 
TTNGS LSDPSKEVE YVCELC KGAAP PDS PWYSDRAG YNKQWHP 
TCFVCAKC5EPLVDLIYFWKDGAPWCGRHYCESLRPRCSGCDEI 

IFAEDYQRVEDLAWHRKHFVCEGCEQLLSGRAYIVTKGQLLCPT 
CSKSKRS 




6657
6658 


35


2120 
855 


LLTCQERAGiJCLLSASTMKEVVYWSPKKVADWLLENAMPBYCEP 

LEHFTGQDLINLTQEDFKKPPLCRVSSDNGQRLLDMIETIjKMEH 

HLEAHKNGHANOHLNIGVD I PTPDGSFS IKIKPNGMPNGYRKEM 

IKIPMPBliERSQYPMEWGKTFIiAFLYALSCFVLTTVMISVVHER 

VPPKEVQ?PLPDTFFDHFNRV0WAFSICEINGMI1,VGI,WLIQWL 

LLKYKS I ISRRFFCIVGTLYLYRCITMYVTTLPVPGMHFNCSPK 

LFGDWEAQLRRIMKLIAGGGLS ITGSHNMCGDYLYSGHTVMLTL

TYLFIKEYSPRRXWWYHWICWLLSWGIFCIIJ^AHDHYTVDVVV 

AYY^TTRLFWWYimiANQQVLKEASQMNiaiARVWWYRPFQYFEK 

NVQGIVPRSYHWPFPWPWHIiSRQVKYSRLVNDT 

HCCALGAPGSPYRGLYFSSAAPCTAPRKAKHQSTLEGLTKRMLN 








fdpvpvkqeamdpvsvsypsnymbsMkpnkygviystplpekff 

QTPEGLSHGIQMBPVDLTVNKRSSPPSAGNSPSSLKFPSSHRI^A 

SPGLSMPSSSPPIKKySPPSPGVQPFGVPLSMPPVMAAALSRHG 

IRSPGILPVIQPVWQPVPFMYTSHLQQPLMVSLSEEMENSSSS 

MQVPVIBSYEKPISQKKIKIEPGIEPQRTDYYPEEMSPPLMNSV 
SPPQALLQE 


6659 


18 


523 


EPQRGDCETWFQNCSLPKFVCFPCWGFWLWRAHSMSNLHSLPGL 
RGLTS TSRMQIiQCTNAMRVINNYQRRWKNQNTFLLAT FANWNV 
CGNPTITCPHNRTXjNNCHHSGVQVPLMYCNLTTPSPQNISNCRY 
AQTPANMFYI VACDNRDQRRDPPQYP WPVHLHTI I 


6660 


514 


1707 


OUlSLDCRHHU:EPDMia.VWPSAlClXQAAAGASAIU\CDSVTSNV' 
LPLLLEQFHKHSQSSQRRTILEMLLGFLKLQQKWSYEDKDQRPL 

ngfkdqlcslvfmaltdpstqlqlvgirtltvlgaqpdllsyed 
lelavghlyrlsflkepsqscrvaalbasgtlaalypvafsshl 
vpklaeelrvgbsnltngdeptqcsrklcclqalsavsthpsiv 
ketlplllqhlwqvnrgnmvaqssdvxavcqslrqmabkcqqdp 
escwyfhqtaipclijuavqasmpekepsvlrkvlledevlaam 

VSVIGTA'ITHLSPBLAAQSVTHIVPLFLDGNVSFLPEWSFPSRF 

QPFQDGSSGQRRLIALLMAFVCSLPRNVSBHIWEVLLFNLDKVT 
PG 


6661 


179 


430 


GVHAASGTLSATWLAEAKMFDSIiAXAGKYLGQAAKIiM IGMPDYD 
NYVEHMRVNHPDQTPMTYEEFFRERQDARYGGKGGARCC 


6662 


185 


423 


RSLPKPAPAQPASIHCARFSGVTPPTAKTAMSDGNTAPNALMYC 
GPKADDGNI FSACAPASSAVKAS VSVAQPGQAVTP 


6663 


3 


ioos 


RPVLSSRVDDFVyVLPKTSGRRKKLEHMYSVDRVSDDIPIRTWF" 
PKENLFS FO^TASTTMQAISNFRKHLRMVGSRRVKAQTFAERRER 
S FS RSWSDP TPMKADTSHDSRDS S DLQS SHCTLDEAFEDLDWDT 
BKGLEAYACDTEGFVPPJCVMtilSSKVPKAEYIPTIIRRDDPSII 
PILYDHEHATFEDILEE IERKLNVYHKGAIQ WKMI»I FCQGGPGH 
LYLLKNKVATFAKVBKEEDMIHFWTKRLSRLMSKVNPEPNVTHIM 
GCYILGNPNGEKLFQNLRTLMTPYRVTFBSPLELSAQGKQMIET 
YFD FRLYRLWKSRQKS KLLDFDDVL 


6664 




968 


PRIiRLPRSVVVMDSPWDEIiAIAFSRTSMFPFFDIAHYLVSW 
WRQPGAAALAWKNPISSWFTAMLHCFGGGILSCLLLAEPPLKF 
IiAiraTOTLIASSIWITFFCPHDLVSQOTSYLPVQIiLASGMKEV 
TRTWKIVGGVTHANSYYKNGWIVMIAlGWARGAGGTIITNFERIi 
VKGDWKPEGDEWLKMSYPAKVTLLGSVIFTFQHTQHLAISKHNL 
MFLYTI FI VATKI TMMTTQTSTMTFAPFEDTLSWMLFGWQQPFS 
S CE KKS B AKS PSNGVGS LAS KPVD VAS DNVKKKHTKKNE 


6665 


171 


1278 


UERRLACRQWTQQRSBLYPGFQKRQRFLPKAGEEAAAQGGRHL 
PGRWLGPGCTQNPCSVHTATGPEPRKLPLLPPDSPNSGYPKBPA 
ALCPGI PSPCRMTHQDLS ITAKL IWGGVAGLVGVTCVFP ID LAK 
TRLQNQHGKAM YKX3M I DCLMKTARAEGFF^M YRGAAVNLTL VTP 
EKA1 KLAANPFFRRLLMEDGMQRNLKMEMIAGCGAGMCQVVVTC 
PMEMLKIQLQDAGRLAVHHQGSASAPSTSRSYTTGSASTHRRPS 

aiMjn i yv>W4V;Ju I tl\yUJA± LtLtKDlPzfSl X YFPLFANlrNN 

LGFNELAGKASFAHS FVSGCVAGS IAAVAVTPLDVLKTRIQTLK 
KGLG EDMYSG ITD CAR 


6666 


498 


2868 


MTTF LP VPQMMAGFS FGTFGNPPMES P SAWQTI HQ P FI VS CLTL 
WSPGCWPQPZQKBGVGLWD I RKPQSSLLRYGGWLSLQSAMSVRF 
NSNGTQLLAI^RLPPVLYDIHSRLPVFQFDNQVYFNSCTMKSC 
CFAGDRDQ Y ILS GSDDFNL YMWR I PADPBAGG I GRWNGAFM VL 
KGHRSIVNQVRFNPHTYMICSSGVEKIIKIWSPYKQPGCTGDLD 
SRI^DDSRCLYTHEEYTSLVIiNSGSGLSHDYANQSVQEDPRMMA 
PFDSLVRREIEGWSSDSDSDLSESTILQLHAGVSERSGYTDSES 
SASLPRSPPPTVDESADNAFHI^PLRVTTTNTVASTPPTPTCEP 








AASRQQRLSALRRYQDKIUjLALSNESDSEENVCEVBIiDTDIjPPR 
PRS PSPBDESSSSSSS SSSEDEEELNERRASTWQRNAMRRRQKT 
TREDKPSAPIKPTNTYIGEDNYDYPQIKVDDLSSSPTSSPERST 
STLEIQPSRASPTSDIESVERKIYKAYKWLRYSyiSYSNNKDGE 
TSIjVTGEADEGRAGTS H KDNPA PS S S KEACLN I AMAQRNQDXjP P 
HG CS KDT FKE ETPRTP SNG PGHEHS SHAOTAE VPBGTS QDTGNS G 
SVEHPFETKKLNGKALSSRAEEPPSPPVPKASGSTLNSGSGNCP 
RTQSDDSEERSLETICANHNNGRLHPRPPHPHNNGQNLGELEW 
AYS S PGHS DTDRDNSSLTGTLLHKDCCGS EMACSTPNAGTRED P 
TDTPATDSSRAVHGHSGLKRQRIELEOTDSENSSSKKKLKT 


6667 


171 


1310 


ABEVERIiAAMRSDSLVPGTHTPPIRRRSKFANLGRIFKPWKWRK"' 
KKSEKFKHTS AALERKIS MRQSREEL IKRGVLKE I YDKDGELS I 
SNE ED S LENGQS LS SS QLS LPALSEMEP VpMPRD PCS YE VLQPS 
DIMDGPDPGAPVKLPCLPVKLSPPLPPKKVMICMPVGGPDLSLV 
SYTAQ KSGQQGVAQHHHTVLPSQ IQHQLQYGSHGQHIj P STTGSL 
PMHPSGCRMIDELNKTIAMTMQRLESSEQRVPCSTSYHSSGIiHS 
GDGVTKAGPMGLPEIRQVPTWIECDDNKENVPHESDYEDSSCL 
YTREEEEEEEDEDDDSSLYTSSIiAMICVCRKDS LAI KPSNRPSKR 
ELEEKNI EjPRQTDEERLELRQQIGTKL 


6668 


714 


358 


TLAVATGPALTLRCHVCTS SSNCKHS WCPAS SRFCKTTNTVE P 
LRGNL VKKD CA E S CTPS YTLQGQVS S GTSSTQ CCQEDLCNE KLH 
NAAPTRTALAHS AZjSLGLALS LLAVI LAPSL 


6669 


459 


1207 


KDEETRKDYDYMLDHPEEYYSHYYHYYSRRLAPKVDVRWI LVS

VCAIS VFQFFSWWNS YNKAIS YLATVPKYRIQATB I AKQQGLLK 

KAKEKGK^KKSKEEIRDBEENIIKNIIKSKIDIKGGYQKPQICD 

LLLFQIILAPFHliCSYIVWYCRWIYNFNlKGKEYGEEERLYIIR 

KSMKMSKSQFDSLEDHQKETFLKRBLWIKENYEVYKQEQEEELK 

KKLANDPRWKRYRRWMKNEGPGRLTFVDD 


6670 


184 


594 


VARI*GEAAKMSSEPPPPYPGGPTAPLLEEKSGAPPTPGRSSPA 
VMQPPPGMPLPPADIGPPPYEPPGHPMPQPGFIPPHMSADGTYM
PPGFYPPPGFHPPMGYYPPG P YTPG P YPG PGGHTATVLV P SGAA 
TTVTV 




1 


763 


bPABKPRS APNMAGGRCG PQ LTALLAAWI AAVAATAGPEEAALP 

peqsrvqpmtasnwtlvmeget'tmlkfyapwcpscqqtdseweaf 
akngeilqisvgkvdviqepglsgrffvttlpaffhakdgifrr 
yrgpgifedlqnyilekkwqsvepltgwkspasltmsgmaglfs 
isgkiwhlhnyftvtlgipawcsyvffviatlvfglsmdlvl^v 

ISQCNWDPPYRHVS * / RPSTNLGVHTAHTSEHLRL 


6672 


304 


1089 


apgskpvqf^fegktsfgmsvfnlsnaimgsgilglXyamaht 
gvi fflalllcialls s ysihllltcagiag irayeqlgqrafg 
pagkwvatvt clhnvgams s ylfi ikselplvigtflymd peg 

DWFLKGNLLI IIVSVLI ILPLALMKHLGYLGYTSGLSLTCMLFF 
LVSVIYKKFQLGLCYRATMKQQWESEALVGTPQPRDSTAAVKAQ 
MFHS*LTGVLTQWPIMAFAFVCHPGGAGPSITELCRAFQAQD 


6673 


1116 


1963 


LQ IQTHHTHHGARVTHLGS HQLLANAGTD1LCRQQS SSMAP AFSQ 
SVTCGPSPCVRKQESATKCLHIGACGSDLWARGMEQG*G*GLNV 
WI J CPCVAFHRGARPQAEEGGARWNSLVSSPWIPPNP*HSSIGAB 
NAVPRP*QG*KVNPSGQERQS\WVLPLPVPGEPLKLPGLPG*NK 
SFSRV/SGSKGKWILPRQLM*AS*R\TPRFVPGTQWVPITW/PL 
ITWH*SAPTPPLKACPAPRESDPCSSCLSCPCVTQKPRFSDTGW 
FGAGHCHS S CDFTRKGAAGGPG 


6674 


1 


440 


LEFD YMCQ YDYVBVRDGDNRDGQI IKRVCGNERPAPIQS IGSSL 
HVLFHSDGS KNFDG FHAI YEE I TACSSSPCFHDGTCVLDKAGS Y 
KCACLAGYTGQRCENLLEERNCSDPG/WPSQWVPENNRGPWAYQ 
PTPC* IGTR VAFFLT 


6675 


277 


1678 


GNWPTRRMAFLDNPTI IIiAHIRQSHVTSDDTGMCEMVLI DHDVD 
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSACELKSLFS 
KKSLKEKPP ISGKQS I LS VRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKiCIDVYLPLHSSQDMXPtriWrMASARVQDLIGLrcWQ 
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSKEPIHKF 
GFSTLAXiVEKYSSPGLTSKESLFVRINAAHGFSLIOVDNTKVTM 
KE ILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIAXVQDMLSSH 
HYKSFKVSMIHRLRFTTDVQL/GCAI*FPGVLRKRAAPVDCLRPS 
ADTWRQEQ IGCCGAACAALRS * DS H KC*EG I SGD KVEI D P VTNQ 
KASTKFW I KQKP I S IDS DI»I*CAC\ DLAEE 




277 


1678 


GNWPTERMAFLDNPTI ILAHIRQSHVTSDDTGMCBMVLIDHDVD 
LB KI HP PSMPGD3GS B I QGS NGETQG YVYAQS VD I TS S WD FGI R 
RRSNTAQRtiERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE 
KKS LKB KP P I S G KQS I LS VRLEQC PLQLNNPFNE YSKFDGKGHV 
GTTATKKID V YLPLHS S QDRLLPMTVVTMASARVQDLI GL ICWQ 
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNE PI HKF 
GFSTLALVEKYSS PG LTSKE SLFVR INAAHGFSL IQVDNTKVTM 
KEI LLKAVKRRKGSQKVSGS RADGVFEEDSQIDI ATVQDMLSSH 
HYKS FKVSMTHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS 
ADTWRQEQIGCCGAACAAI»RS*DSHKC+EGISGDKVEIDPVTNQ 
KASTKFWIKQKPISIDSDLLCAC\DLAEE 


6677 


277 


1678 


GNWPTERMAFLDNPT 1 1 LAHIRQS HVTSDDTGMCEMVL IDHDVD 
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLSRUlKERQNOIKCKNIQWKERNSKQSAGEIiKSIiFE 
KKSLKEKPP I SGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTAT KKI DVYLPLHSS QDRLLP MTWTMAS ARVQDLIGI>I C WQ 
YTSEGREPKLNDNVSAYCLHIABDDGEVDTDFPPLDSNBPIHKF 
GFSTLALVEKYSS PGLTS KESLF VRINAAHG FSL IQVENTKV'CM 
KE I LLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH 
HYKS FKVSM IHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS 
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ 
KASTKFWIKQKPISIDSDLLCAC\DLAEE 


6678


221 


865 


GPSNQSSGSLSLIVTGCSSYWS*INDTCTILRVLSSNFGRQ*Lk 
PFPCSQLPMSQGCLWHLDCCCPWVP Yl PGQQWRKG3QRMRN *QS 
LLGSDQBSVGLEDLCVFVNFLLHVI,LGLFP*PHELFLLPVVDLG 
FLFPLLbQGGCHCLVLPANLVSQAPQ IGKLSCRLQTHDLEGSRN 
HHPLFLWGRWDAVKHLETVQSGLASLGFVGQHTSHGPP 


6679 


2 


786 


LEFARGAMPFLGQDWRSPGQNWVKTVDGWKRFLDEKSGSFVSDL

SSYCNKEVYNKENLFNSLNYD/SCSQEEKEGHAE*QNQNS\DFH 

QEICWZYVHKGSTKERHGYCTLGEAFNRLDFSTAILDSRRFNYVV 

RLLELIAKSQLTSLSGIAQKNFMN3LEKWLKVLEDQQNITLIR 

ELLQTLYTSLCTLVKRVGKSVLVGNINMWVYRMETILHWQQQLN 

NIQITRVSGQAQPPPGSGSLHRDTGQTRQDFEFTPVTEESGLF 


6680 


1498 


2951 


p\rknepdthcprgearpev«hlpkphspgsegaeiqtsa*alp 
/nqvs p pqpm *gaeengdqrggkeeageelhrsssgltaapgf? 
evhrnlqtfpglpsrgggp/ggagtqgswapgbqpp/spllpas 
mqrsqaglpg weaglves pthhipalrpsgtnatgeafpsttcs 
sgp \ pap pgptglrpgggsssgghg* * pglpvgkv\galgaaqd 
pqs qgrg ptqgtvgtemllsglgs akac paarpavp * lps d pas 
tipkkgtrgfgegpgvlqernrwvvgraqgftsadaagtappgv 
* lpaplsqppgate pqvracgmapps pgtsgrlvawgrhpg pqv 
aqgcppgagcwgsqprgsqrcprtyths plghgrapcprrcwh* 

WQDPPS S PRTGCLPGI PARQAYSAPRTRSRPG IRTGRAAYGF IR 
FQGGGGG 


6681 


1169 


511 


lWyiYYNQQtiRAFHELK\BKLMSAPALGliPDLTKiPTLHVSBRE 
KMTVGVLTQTVG P WS RPGAY LS KQLDGVS KG WPPCPRALAATAL 
IAQEADBLTLRQNLNRKSPHA\VVTLINTKGHH*LINARLTRYQ 
TLLCENPHKTI EVSNT/ LNPATLLLVTES P VKHNCLEVLDS VYS 

SRPNLRDHP*TSVDWBLYVDGSGFANPCKVTLKKSTSPAPVTPR 
S 


6682 


109 


1238 


T VLCGAMQVS SLNE VKX YS LS CG KSLPE WLSDRKKRALQfUCDVD 
VRRRIELIQDFEMPTVCTTIKVSKDGQYILATGTYKPRVRCYDT 
YQLSLKFERCI.DSEWTFBILSDDYSKIVFLHNDRYIEFHSQSG 
FYYKTR I PKFGRDFS YHY PS CDL YFVGAS SEVYROJLEQGRYTiN 
PLQTI^ENNVCDINSVHGLFATGTIEGRVECTOTOTRNRVGI,I, 
D\AP*TVSQQrQR*TSLPTISALKFN\GALTMAVGTTTGQVLLY 
DLRSDKPLLVKDHQ YGLP IKSVHFODSLDLI TiSADSR t WMwnv 

NSGKIFTSLEPEHDLNDVCLYPNSGMLLTANBTPKMGIYYIPVL 
GPAPRWCS FLDNLTEELE ENPESNE 


6683 
6684


109 


1238 


TVLCGAMQ VSS LNEVK.I VSLSCGKS LPBWLSDRiCKRALQKKDVD 
VRRRIELIQDFEMPTVCTTIKVSKDGOYlLATnTYK'PRVT?PVnT 
YQI^LKFERCLDSEVVTFEILSDDYSKIVFLHNDRYIEFHSQSG 
FYYKTRIPKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN 
PLQTDAABNNVCDIN3 VHGLFATGTI EGRVECWDPRTRNRVGLL 
D\AP*TVSQQIQR*TSLPTISALKFN\GALTMAVGTTTGQVLLY 
DLRSDKPLLVKDHQYGIiPI KSVHFQDSLDL II*SADSRI VKMNNK 
NSGKI FTSL E PEHDIiNDVCLY PNSGMLLTANETPKMG I Y Y I PVL 
GPAPRWCS FLDNLTEELEENPESNE 




111 


527 


GLRGGTSRGRAGREPE FAAG VLCWAGFCQ S P CPPGGRGREAP A 
PP \ SGRRHA* R PA* WLGG PGGDSGGREEGGS /GELQRAMES KMG 

ELPLDINIQEPRWOQSTFU3RARHFFTVTDPRNLLLSGAQLEAS 
RNIVQNYR 


6685


256 


1473 


KIiLGDWFEGFCNKFELSDShlNGSNS^QSPIj^FDHLFDPDPQKVL 
QG VIDMKNAVIGNNKQKANLI VLGAVPRLLYLLQQETSS TELKT 
ECAWLGSLAMGTENNVKSLLDCHI I PALLQGLLSPDLKFI EAC 
LRCLRTI FTS P VTPEELLYTDATVI PHLMALLS RS R YTQE YI CQ 
I FSHCCKG PDHQTI LFNHGAVQNI AHLLTSLS YKVRMQALKCFS 
VLAFENPQVS MTLVNVLVDGELLPQ I FVKMLQRDKP IEMQLTSA 
KCLTYMCRAGAIRTDDNCIVLKTLPCLVRMCSKERLLEERV3GA 
ETLAYL I EPD VELQRIAS I TDHLI AMLAD YFKYPS S VS A2TD I K 
RLDHDIiKHAHEliRQAAFKLYASLGANDEDIRKKVSLGEGRP PVL 
TASRQGVTST 


6686 


310 


327 


U^VTFDDIAVDFTPKEWTLLDPTQRNLYRDVMLE^KNIiATVGY 
QLFKPSLISWLEQEESRTVQRGDFQASEWKVQLKTKELALQQDV 
LGEPTSSGIQM XGSHNGGE VSDVKQCGDVSSEHS CUCTHVRTQtf 
SENTFE CYLYG VDFLTLHKKTSTGEQRS VFSHVWKKPS SLNPDV 
VCQKNRCTRKKKA F *LQLTLGKS FH* S I HT 1 


6687


181 


915 


JgAWIiEAPYKKEEDEQQRKEVKKDYPSNTT^Sl'SKSGNETSGSST 
IGETSNRSRDRDRYRRRNSRSRSPGRQCRHRSRSWDRRHGSESR 
iSRunKKCiUKVHY Kb ypiiA rGEP VDNLS PEERDARTVFCMQLAAR 

IRPRDLEDFFSAVGKVRDVRIISDRNSRRSKGIAYVEFCEIQSV 
PIAIGLTGQRLLGVPIIVQASQAEKNRIiAAMANNLQKGNGGPMR 
LYVGS LHFNITEDMLRGI FEPFGKV 


6688


1025 


1 


AE VPNYPH VFHKCPDS CWRFK!fQ PJQLQP YILLSF&&EKP PZSF 
SEPGL PR/ S ATARMATAAAP PNSS IDLPS DSGMG F I S PAGDSLD 
LPSDGGTGFFSIAGDSSSTRLSSLAFISFSLSSVSVGSSAGTTS 
STSVGSWAAPTSSSSSSTNRDVAGLDFSTVITSVSGSLVPSRE 
VAVICX3SKGAGASGSASCSSRAGKTTEATAASSMPSGTSSFSTC 
rMSELEELFSLFSPAPLLSKLFTSSGSIAICCQDSGPSDTGRLS 
VCQLWIiADSDTGKLSDOQEVVTVGDSGGI*TCPELSLGRM*MSLL 








SSAVI PGYSSSSDSRLNTVPTVDUjCP FQTKSST 


6689 


640 


1299 


SSSASYATSATSISDTAFSGSLKLKHGLLSALDSSSRTS*STSS 
AEDSTFRICSPSVSDTSSDSSGSKDNVLILFSKVSI*SCFSLSS 
FFSDSISFCFSSSSFCKR*FVSSKVSQNALLSSRLSNGPGGSSK 
QRNSLTARQLAMSL* ATKF *RNAC3^PNCLSS KKSAL* LSLNQRF 
GGS AS R KPGNtt S FNSQKCS ALS YCCNFVI KPREVS VSS ENY PAF 


6690 


1 


442 


GTRGKMAATLGPLGSWQOWRRCIjSARDGSRMHiLIiLLLGSGQGP 

WW v urtou irCiI i_iI\_K-£.n;s Jjo JSjt I IJv? V L»I(ji>i> b LWNLiMGNAM VMTQ 

YIRLTPDMQS kqgalwnrvpcfj^rdwblqvhfk IHGQGKKNI>\H 

GDGLAI W YTKDRMQP 


6691 


287 


1401 


lktetseekarrykdrpsqlnavfqeqkkmiqaqesitledVav " 
dftweewqllgaaqkdlyrdvmlbnysnlvavgyqaskpdalfk 

LEQGBQLWT1EDGIHSGACSDIWKVDHVLBRLQSESLVNRRKPC 
nanuj\t* ujn i vhcsK5QFIiLX3QNHDI FDLRGKSLKSNLTLVNQSK 
GYEIKNSVEFTGNGDSFLHANHERLHTAIKFPASQKLISTKSQF 
ISP KHQKTRKLE KHHVCSE OGKAFI KKS WLTDHQ VMHTGEKPHR 
CSLCEKAFSRKFMLTEHQRTHTGEKPYECPECGKAFLKKSRLNI 
HQKTHTGEKPYICSECGKGFIQKGNLIVHQRIHTGEKPYICNEC 
/ G KG FIQKTCLIAHQR F1ITER 


6692 


178 


939 


w J. lUi(j£.Lfcb 4j WERFCAN 1 1 KAGPMPKH IAF I MDGNRRYAKKCQVE 
RQEGHSQGFNKLAETIiRWCLNIjGI LEVTVYAFS I ENFKRS KSEV 
DGLMDLARQKFSRLMEEKEKLQKHGVCIRVLGDLHLLPLDLQEL 
IAQAVQATKNYNKCFLNVCFAYTS RHE I SNAVREKAWGVEQGLL 
urdi»i*ei&ijjijjs.tiix iWKoi^HJ^UlijiRTSGEVRIiSDFLLLWQTSH 
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK 


6693 


178 


939 


WIKEGELSLWERFCANIIKAGPMPKHIAFIMDGNRRYAKKCQVE
RQEGHSQG FNKLA2TLR WCLNLG I LEVTVYAFS I ENFKRS KS EV 

I AQAVQATKNYNKCFIiNVCFAYTS RHE I S NAVREMAWG VEQGLL 
DP SD I S ESLLDKCL YTN RS PHPD I L I RTSGEVRLSDFLLWQTSH 
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK 


6694 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR 
EVHSLGQILPQDGLTAEAGPPEAQDPWGSPGISLPAAHIGFAAA 
LAVG PS G CHTEP V FDE VW PSL FLGDAYAARDK ^ in .t ot n t mnm 
NAAAG KFQ VDTGAKFYRGMSLEY YG I EADDNPFFDL5 VYFL P 


6695 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR
EVHSLGQILPQDGLTAEAGPPEAQDPWGSPGISLPAAHIGFAAA 
LAVGPSGCHTEP\FDEVWPSLFLGDAYAARDKSKLIQLGITHW 
NAAAGKFQVDTGAKFYRGMSLEYYGI EADDNPFFDLS VYFLP 


6696


1 


782 


PRVRGRVGERWAFLSVPAAMSSEMEPLLLAWSYFRRRKFQLCAD 
LCTTQMLEKSPYIX3AAWILKARAIjTE>rVYIDEIDVDQEGIAEMMLi 
DENAIAQVPRPGTSLKLPGTNQTGG PSQ AVRPITQAGRP I TG FL 
RPSTQSGRWTTffiQAiRTPRTAWARPITSSSGPJFVlUiGTASML 
TSPDGPFI^SRLNLTKYSQKPKLAKALIEYIFHHEHDVKTALD 
LAALSTEHSQYKDWWWK/DQIEKCYYRVGMYREAEKQXKSS 




3 


782 


PPLFLRRLNSRALR PGSRKVMA WPASLSGQDVGS FAYLTI KDR 
IPQILTKVrDTLHRHKSEFFEKHGEEGVEAEKKAISLLSKLRNE 
LQTDKPFI PLVBKP VDTD 1 WNQYLE YQQS LLNES DG KS RKF YS P 
WLLV\ECYMYRRIHEAI\IQSPPIDYFDVFKESKEQNFYGSQES 
IIALCTHLQQLIRTIEDLD\ENQLKDEFFKLLQISLWGBISVDL 
SL\SG<5ESSSQNTNVLNSLEDLKPFILLNDMEHLWSLliSNCK 


6698 


668 


754 


VGS CACAGS CKCKECKCTS CKKS ECRAFP 


6699 


325 


492 


EGELP/PARRVLPRAMTASAQPRGRRPGVGVGVWTSCKHPRCV 
LLGKRKGSVGAGSFQLPGGHLEFGErrWBBCAQRETWEEAALHLK 
NVHFAS WNS FI EKEWY^Y^/TI LMKGEVDVTHDSEP KNVEPE KN 








ESKRIIYNHAFFFQESKWSGGILQ 


6700 


1098 


1392 


TQC WRS STPGMRTHFRTQP / RLECGQGFSQQENGHCMDTNECIQ 
FPFVC PRDK PVCVNTYGS YRCRTNKKCSRGYEPNEDGTAC VERT 
LLLGLCNLLGK 


6701 


2 


1485 


jwvj r K 1 KvKKAAAr iidy e o FSPGLG PTSDKAAAPRTPKRRRLW 
RQRQ/HPAMLCYVTRPDAVLMEVBVEAKANGEDCLNQVCRRLGI 
IEVD Y FGLQ FTGS KGESL WLNLRNR I S QQMDGLAP YRLKLRVKF 
FVEPHb I LQSQTRH I FFLH I KE ALLAGHLLCS PEQAVELS ALLA 
QTKFGDYNQNTAKYNYEELCAKELSSATLNS I VAKHKELEGTSQ 
ASAEYQVLQIVSAMENYGIEWHSVRDSEGQKLLIGVGPEGISIC 
KDD FS P INR IAYP WQMATQSGKNV YLTVTKESGNS IVLLFKMI 
STRAASGLYRAITETHAFYRCDTVTSAVMMQYS RDLICGHLASLF
LNENINLGKK YVFD I KRTS KEVYDHARRALYNAGVVDL VSRNNQ 
SPSHSPLKSSESSMNCSSCEGLSCQQTRVLQEKLRKUCEAMLCM 
VCCEEE INS T FCP CGHTVCCES CAAQLQ VGESAAHFCLQPHLS I» 
LLTGSRSQVXiAR 


6702 




/ 1 


PLAKFLKLDLVNVLCLPMEDVFLFYRTCFCSMGLGSSCHLSLPlT" 
RAEALL CSRKATVVRDLVAVRMAEEQE FTQLCKLPAQ PSHPHCV 
NNTYRSAQHSQALLRGIiLALRDSGILFD WLWEGRHIEAHRI L 
LAASCD YPKGMFAGGLKEMEQEEVLIHGVSYNAMCQI LHFIYTS 
ELELSLSNVQETLVAACQLQI PEI IHFCCDFLMSWVDEENILDV 
YRLAEL FDLSRLTE QLDT Y I LKNFVAFSRTDKYRQLPLE KVYS I> 
LSSNRLBVSCETEVYEGALLYHYSLEQVQADQI SLHEPPKLLET 
VR FPLMEAE VLQRLHDKLDP S P LRDTVASALMYHRNES LQPS LQ 
SP01'ELRSDFQCVVGFGGIHSTPS\MSSATRPKYLNPLLGEWKH 
FTAStiAPRMSNOX»IAVLNNFVYLIGGDNNVQGFRAESRCWRYD? 
RHNRWFQIQSLQQEHADLSVCWGRY I YAVAGRDYHNDLNAVER 
YDPATNSWAYVAPLKREVYAHAGATLEGKMYITCGRKGRIT 


6703 


45 


1244 


GVGPRAAAMPLELELCPGRWVGGQHPCFIIAEIGQNHQGDLDVA 
KRM1 RMAXECGADCAKFQKSELEFKFNR KALER P YTS KHS WGKT 
YGEHKRHLEFSHDQYRELQRYAEE VG I F FTASGMDEKAVE FLUE 
LWPFFKVGSGDTNNFPYLBKTAK/TRGWHSVLRDVCGVQLNDE 
TSSWDVl^RVRTSIOeKVLMVLVLDYSGRPMVISSGMQSMDTMKQ 
VYQIVKPLNPNFCFLQCTSAYPLQPEDVNLRVISEYQKLFPDIP 

PGELAEliVRS VRLVERALGS PTKQLLPCEMACNEKLGKSWAKV 
KI P3GT ILTMDMLT VKVGEPKG YPPEDI FNLVGK KVLVTVBEDD 
TIM3E 


6704 


92 


1007 


TMNTRNRWNSGLGASPASRPTRDPQDPSGRQGBLSPVEDQRE'G 
LEAAPKGPSRESWHAGQRRTSAYTLIAPNINRRNBIQRIAEQE 
LANLEKWK2QNRAKPVHLVPRRLGGSOSETEWOKOOT.nT ,maq w 
YKQ KLKRE SS VRI KKEAE BAELQ KMKAIQREKSNKLEEKKRLQB 
NLRREAFREHQQYKTAEFL/RQTEHRIARQKCLSKCCLWPTILN 
MGQKLG LQ \DSLKAEENRKLQKM KDEQHQKS ELLELKRQQQEQE 
RAXIHQTEHRRVNNAFLDRLQGKSQPGGLEOSGGCWNMNSGNSW 
GI 


6705 


2 


786 


RLCRNSARVPCGWSASRS LGEGAGFIGPLRG^PkPRAGGTGTS FT 
S YKRKGGIMSTIAAFYGGKSIL ITVATGFLGKELMEKLFRTSPD 
LKVIYILVRPKAGQTLQHRVFQILDSKLFEKVIEVRPNVHEKIR 
AI YADLNQWDFAI SKE DM QELLS CTNI I FHCAATVRFDDTLRHA 
VQLNVTATRQLLLMASQMPKLEAFIHISTAYSNCNLKHIDEVIY 
PCPVE PKKI IDSLEW\LDDAI IDEITPKLIRDWPNI YTYTK 


6706


130 


531 


FTBSSSSHSQEMW3KLNMLRNDGJIFCDITIRVQDKIFRAHKVVL 
AACSDFFRTKLVGC^DENKNVLDLHHVTVTGFIPLLEYAYTAT 
LS INTENI IDVLAAAS YMQMFS VASTCSEFMKSSILWNTPNSQP 
EK 


" 6707 


2233 


1343 


YWSGIGYEfcQHFHWRKFHFEKKGPPSTCQBRLYESRSRWPCIS* 
GMVWG WTAVNGS W * GGQLRCVCVCTSHSSDSTRSSQRASKCHS 
FFILSQ*KT*SSWENV7VFAKYSRIYSYGHSCSKGRGD*DFK*NV 
SQAR*SRFCGLCNPCGHCGLDINLRGGSSPWTDKHSCVHNNLLC 
NRRVFSLIiCEGPSHCYQGAVCREACAAASPGLDSAAEPHRLCEH 
TD * L PK* GPG YI QHFHCDSNI LCI LYNI S FNLFS YS F *GVARYA 
C*RCHWYFEWLLYNHCGDILVACL*RRQL* SSQ 


6708 


115


1729 


T VGS WSRSGRSPPVGRQLLLTG RGAQ AAGS PQGGMALQVE LVPT 
GEI IRVVHPHRPCKIALGSDGVRVTMESALTARDR VGVQD FVLL 
ENFTS EAAF I ENLRRRFR ENL I YT Y I GPVLVS VN P Y RDI* QI YSR 
QHMERYRGVSFYEEPPHLIAVADTVYRALRTERRDQAVMISVES 
GAGKTDATKRLLQLYAETCPAPQRGGAVRDRLLQSNPVLEAFGN 
AKTLRNDNS SRFGKYMDVQFDFKGA? VGGKI LS YLLEKS R WHQ 
NHGERNFH I FYQ LLEGGEEBTLRRLG LERNPQS YLYLVKG QCAK 
VS S INDKSDWKWRKALTVI DFTE DE VE DLLS I AASVLHLGNIH 
FAANEESNAQVTTEKQLKYLTRLLS VEGSTLREALTHRKI IAKG 
B ELLS PLNLEQAAYARDALAKAVYS RTFTKLVG K INRSLASKD V 
ES PS WRSTTVLGLLD 1 YGFE VFQHNS FEQ FCINYCNE KLQQLF I 

ELTI»KSEQEBYEAEGIAWEPVQYFNNKIICDLVEEKFKGII\SI 
LDE\ECLRPGE 


6709 


3 


894 


PPHEHLFPSGERGPFSFLVSRRGLGPGKMGKKGKKEKKGRGAE

TAAKMEKKVS XRSRKEEEDLEAL IAHFQTLDAKRTQTVELPCPP 

PSPRLNASLSVHPEKDELILFGGEYFNGQKTFLYNELYVYNIRK 

DTHTKVDI PSPPPRRCAHQAVWPQGGGQLWVFGGEFASPNGEQ 

FYHYKDLWVLHLATKTWEQVKSTGGPSGRSGHRMVAWKRQLILF 

GGFHESTPJ5YIYYNDVYAFNLDTFTt/SKLSPSGTGPTPRSGCQ\ 

IPSLPRAASSVYGGYSKQRVKKDVDKGTRHSDMP 


6710 


158 


980 


RHKMTMYR VESS SGRAARKMRLALMGPAFIAAIG YI DPGNFATN 
IQAGASFGYQLLWVVVWANIiMAMLIQILSAKLGIATGKNLAEQI 
RDHYPRPVWFYWVQABIIAMATDLAEFIGAAIGFKLILGV9LL 
QGAVLTG I ATFL I LMLQRRGQKPLEKV IGGLLLFVAAAYI VELI 
FSQPNLAQLGfCGMVIPSLPTSEAVFLAAGVL \GATIMPHVI /YI 

WKSSLTQHLHGGSRQQRYSATKWDVAIAMTIAGFVWLAIMATAA 
SBLNF YGHTG VA 


6711 


3 


347 


VTECKTMT CKMSQLERN I * TM INTLHHYS VKLGHPDTL IHGEFK 

BLWTDLHNILMXBNKNDQAI*HIMEDLDTNAHMQIIFKELIML 
MAMLTWSYHDNMHDADYGPGQQHRPG 


6712 


118 


579


PHGQ KRTRYP. QVRAPGQQPQAQLAMALCIjKQ VFAKDKTFRPRKR 
FEPGTQRFELYKKAQASLKSGLDLRSWRLPPGENIDDWIAVHV 
VDFFNRINLI YGTMAERCS *TSCP VMAGGPRYEYRWQDBRQYRR 
PAKLSAPRYMALLMDWIESLI 


6713


2485 


3 


QARGSDSEDGEFE IQAEDDARARKLGPGRPLPTPPTS ECTSDVE
PDTREMVRAQNKKKKKSGGFQSMGLSYPVFKGIMKKGYKVPTPI 
QRKT I PVT LDGKDVVAMARTGS GKTACFLLPMFERI»KIHS AQTG 
ARALI LSPTREIiALQTLKFTKELGKFTGLKTAL I LGGDRMEDQ F 
AALHENPDI I IATPGRLVHVAVEMS UKLQS VE YWPDEADRLFE 
MG PAKQLQE I IAR I » PGGHQTVL FS ATLPKLLVEFARAGLTEPVL 
IRLDVDTKLKBQLKTS FFLVREDTKAAVLLHLLHNWRPQDQTV 
VFVATKHHAE YLTELLTTQRVS CAH I YSALDPTARK INLAKFTL 
GKCSTL I VTDLAARGLD I PLXDNVINYS FPAKGKLFLHRVGRVA 
RAGRSGTAYSLVAPDEIPYLLDLHLFLGRSLTLARPLKEPSGVA 
GVDGMLGRVPQSWDEEDSGLOSTLEASLEIjRGLARVADNAQQQ 
YVRSRPAPSPESIKRAKEMDLVGLGLHPLFSSRFEEEELQRLRL 
VXDS IKNYRSRATI FEINASSRDLCSQVMRAKRQKDRKAIARFQQ 
GQQGRQEQQEGPVGPAPSRPALQBKQPEKEEEEEAGESVBDIFS 
BWGRKRQRSGPJJRGAKRRREEARQRDQSFYIPYRPiCDFDSBRG 


6714 






LS ISGEGGAFEQQAAGAVLDLMGDB AQNLTRGRQQLKWDRKKKR 
FVGQSGQEI)KKKIKTBSGRYISSS YKRDLYQXWKQKQKI D* S *L 
GRRRGILTRRRPRTEEVGEARPLAQAGCIPGPIBVPRHPLQAESA 
LELKTKQQILKQRRRAQKAALSLQRWWPQAALCPQ 




169 


1416


NNCQBLLPPPPAPMAHIPSGGAPAAGAAPMGPQYCVCKVBLSVS 
GQNLLDRDVTSKSDPFCVLPTENNGRWIBYDRTBTAINNLNPAF 
SKKFVLDYHPEEVQKLKFALFDQDKSSMRLDEHDFLGOFSCSTfl 
TI VS SKKITR PLLLLNDKPAGKGL ITIAAQELS DNRVT TL SLAG 
RRLDKKDLFG KSDPFLEFYKPGDDGKWMLVHRTEVI KYTLDPVW 
KP FT VPLVS L CDGDMEKP IQ VMCYD YDNDGGHDFIGE FQTSVSQ 
MCEARDSVPLEFECINPKKQRKKKNYKNSGI 1 1 LRSCKINRDYS 
PLDYII^GCQLWFTVGIDFTT^SNGNPLDPSSLHYrNPMGTNEYL 
S AI WAVGQI I QD YDSDKMFPALG FGAQLP PDWKVSHE FAINFNP 
TNPFCSGVDGIAQAYSACLP 


6715 


32 


493 


G PAGAESGSLHCLPATVQALAGAAHS PHGGQPPRRGPL IGSGMP 
GKPKHLGVPNGRMVLiAVSDGELSSTTGPQGQGEGRGSSLSlHSL 
PSGPSSPFPTEEQPVASWALSFERLLQDPLGLAYFTEFLKKEFS 
AENVTFWKACERFQOT PASDT 


6716 


1 


176 


GAGGPAPRSFGSEEPRAALERDKMSARAAAAKSTAMEETAIWEQ
HTVTLHRVSLCCSK 


6717 


115 


896 


LFAMSGFENLNTDFYQTSYSIDDQSQQSYDYGGSGGPYSKQVAG 
YDYS QQGRFVP PDMMQPQQP YTGQ I YQ PTQAYTPAS PQPFYGNN 
FBDE P PLLE B LG INFDH I WQ KTLTVLHPLKVADGS IMNETD LAG 
PMVF CLAFGATLLLAGKI QFGYVYG I SAIGCLGMFCLLNLMSMT 
G VS FGCVAS VLG YCLLPM I LLSS FAVI FS LQGMVG 1 1 LTAG I IG 
WCS FS ASKI FISALAMEGQQLLYAYPCALXiYGVFALISVF 


6718


290 


599 


KQS3TVPGTILPSLKWHNSGLCKFPETGGKMTTFKEGL.TFKDVA 
VI FTEEEILGLLDP VQRNL YQDVMLEN FRNLLS VGHHPFKH DVFL 
LE KE KKLDI M KTATQ 


6719 


1 


691 


PTR PEEQDRE DGKCHKMEMN P ISGNLNCD P I AMS Q CS SDHG CET 
DLD5DDDKIEKPNNFMKDSAS QDNGLSRKTSR KR VCSSDSDSSL 
QWKKSS KARTGLLRITRRCAATAAN KI KLMS DVEDVSLENVHT 
RSKNGRKKPLHLACTTAKKKLSDCEGSVHCEVPSEQYACEGKPP 
DPDSEGS TKVLS QALNGDSDS EDMLNS EHKHRHTNI H KIDAPS K 
RKSSSVTSSG 


6720 


3 


822 


HE VAEEAGGTVYPQRGTMPGTKRFQH V I ETPE! PGKWELTGYEAA 
VP I TEKSNPLTQDLDKADAEN I VRLLGQCDAE I FQ E EGOALS TY 
QRLYSESILTTKVOVAGKVQEVLKEPDGGLVVLSGGGTSGRMAF 
LMS VS FNQLMKGLGQKPL YT Y L r AGGDRS WASREGTEDS ALHG 
IEELKKVAAGKKRVI VIGI S VGLSAP FVAGQMDCCMNNTAVFLP 

VLVGFNPVSMARHPFPPPRILRSLTVFPSLRAPHYQITSLLFSM 
SWTLISE 


6721 


3 


822 


HEVAEEAGGTVYPQRGTMPGTIOiFQHVIETPEPGKWELTGYEAA - " 
VP I TE KSNPLTQDLDKADAEN I VRLLGQCDAE I FQEEGQALSTY 
V*uj xi> ai> lui xwvy VA^KVQEVLKEPDGGLVVLSGGGTSGRMAF 
LMSVS FNQLMKGLGQKPLYTYL1AGGDRS WASREGTEOSALHG 
IEBLKKVAAGKKRVIVIGISVGLSAPFVAGQMDCCMNNTAVFLP 
VLVGFNP VSMARHPFPPPR ILRSLTVFPSLRAPHYQI TSLLFSM 
SWTLISE 


6722 


X 


390 


RSWSKRTWQAi.PI^Vi.^LLFLCGTPQAAnNMQAIYVALGEAVE " 
LP CPSPSTLHGDEHLS W FCS PAAGS FTTLVAQ VQVGRPAPDPG K 
PGRESRLRLLGNYS LWLEGS KEBDAGRYWCAVLGQHHNYQNW 


6723

1 


173 


659 


VCQYCTARMAD FG I S AGQFVAWWDKS S P VEALKGLVDKLQALT 
GNEGRVS VENI KQLLQSAHKES SFDI ILS GLVPGSTTLHSABIL 
AE I AR I LRPGGCLFLKEP VETAVDNNSKVKTASKLCSALTLSGL 
6725



Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



TtT 



"356" 



6726 



98 



6727 



6728 



486 



6729



"259- 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



7X4 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
VB VKELQR E PLT PEEVQS VREHLGHESDNL 



VCQ YCTARMADFG I S AGQ F VAVVWdKS S P VKALKGLVDKLQALT 
GNEGRVSVENIKQLLQSAHKESSFDIILSGLVPGSTTLHSAEIL 
AE I ARILRPGGCLFLKEPVETAVDNNSKVKTASKIiCSALTLSGL 
VEVKEIiQREPLTPEEVQS VREHLGHESDNL 



kkrtppvilatmdddlmlalrlqeewwlqkaerdhaOeslslvd 

ASWELVDPTPDLG^LFVQFNDQFFWGQLBAVEVKWSVliMTLCAG 
I CS YEGKGGM CS IRLS5 PLLKLR PRKDLVE VFFV 



831 



HLQKMERKINKREKBKEYEGKHNSLEDTDQGKNCKSTLMTLNVG 
G YL YI TQKQTLTKY PDTFLEGI VNGKILCP FDADGHY FI DRDGL 
LFRHVLNFLRNGELLLPEGFRENQLLAQEAEFFQLKGLAEEVKS 
RWE KEQLTPRETTFLE I TDNHDRS QGLRI FCNAPDFI S KI KSR I 
VLVSKSRLDGFPEEFSISSNIIQFKYFIK 



"935" 



1191 



FRGMGDERPHYYGKHGTPQK YDPTFKGPI VMR4CTDI I CCVFLL 
LAIVGYVAVGIIAWTHGDPRKVIYPTDSRGEFCGQKGTKNENKP 
YL PYFWI VKCAS PLVLLB FQCPTP Q I CVEKCPDR YLTYLNARSS 
RDFE YYKQFCVPGFKNNKG VAEVLRDGDCPAVLI PSKPLARRCF 
PA IHAYKG VLMVGNETTYEDGHGSRKNI TDLVEGAKKANGVLEA 
RQLAMRIFBDYTVSWYWDI IS LGI AMAMSLLP I ILLRPLAG I MG 
RGM1 IMGILVLGY 



6730 



784 



6731 



102 



1015 



FCSS wlrsladsslswkmflvgLtgg iasgkssviqvfqqlgca " 

VI DVDVMARHVVQ PGYPAHRRI VEVFGTEVLLENGDINRKVLQD 
LI FNQ PDRRQLLNAI THPEIR KEMMKETFKYFLRE PRTS PRGKK 
HVPS ALKEaDS LMRR DT 

VGLTGAQSGRTASMGRDQRAVAGPALRRWLLLGTVTVGFLAQSV 

LAGVKKFDVPCGGRDCSGGCQCYPEKGGRGQPGPVGPQGYNGPP 

GLQGF PGLQGR KGDKGERGAPGVTG PKGDVGARG VSGFPGADG I 

PGH PGQGG PRGRPG YDGCNGTQGDSGPQGPPGS EGFTGP PG PQG 

PKGQKGEPYALPKEERDRYRGEPGEPGLVGFQGPPGRPGHVGQM 

GPVGAPGRPGPPGPPGPKGQQGNRGLGFYGVKGEKGDVGQPGPN 

GIPSPTLHPIIAPTGVTPHPDQYKGEKGSEGEPGIRGISLKGEE 
GIK 



446 



1205 



6733 



"SIT" 



NMVDYYEVLGLQR YASPED IKKAYHKVALKWHPDKNPBNKKKAB 
RKFKEVA2AYBVLSWDEKRDIYDKYGTEGLWEF 



G A KKRLHGAWPRVEVGCP W ETRJSSEG VHLER PTS PL KNNPEGS 
LDIYAGLDSAVSDSASKSCVPSRNCLDLYEEILTEBGTAKEArY 
NDLQ VE YGKCQLQMKELMKKF KE IQTQNFS LINENQS LKKNI S A 
LIKTARVEINRKDEEI 



TSTT 



6734



169 



GRWQRRPPPPSPPLWCLQPGGGSDPQQLTQLRHCLSHSPQDTPW 
AQRQVCYTAATTQAAAPATRNCLPDKSGHRPTPPRSHRHHRQEN 
LGSIKPSSRSTKATSTTMAGDGRRABAVREGWGVYVTPRAPIRE 
GRGRLAPQNGGSSDAPAYRTP PSRQGRREVR FSDE PPE VYGDFE 
PLVAKERS PVGKRTRLEE FRSDS AKEEVRESAY YLRS RQRRQPR 
PQ5TEEMKTRRTTRLQQQHSEQPPLQPSPVMTRRGLRDSHSSBE 
DEASSQTDLSQTISKKTVRSIQEAPAVSEDLVIRLRRPPLRYPR 

YEATSVQQKVNFSEEGETEEDDQDSSHSSVTTVKARSRDSDESG 
DKTTRSSSQYIESFW 



RSCRQVGMRSR NQGGESASDGHISCPkPS^G^AGEKSLSEDAk - 
KKKKSNRKEDDVMASGTVKRHLKTSGECERKTKKSLBLSKEDLI 
QLLSIMEGELQAREDVI HMLKTE KTKPBVLEAHYGSAE PEKVLR 
VLHRDAILAQEKSIGEDVYEKPISELDRLEEKQKETYRRMLBQL 
LI^KCHRRTVYELE^rEKHKHTDY^I^^XSDDFTNLLEQERERLKK 
LLEQBKAYQARKE 



SAAMFPVFSGCFQELQEKNKSLELVSF EBVAVHFTWEEWQDLDP 
AQRTLYRDVMIjET YS SLVS LGHCI TKPEMI FKLEQGAEPWIVBE 
SEQ
ID 
NO: 


Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of
amino acid
sequence


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
TLNLRLSGGSKKQVFSGICHRSLVELQKVHbV 


" 6735- 


280 


558 


KSRRAGVTKMSNPFLKQVFNKDKTFRPKRKFEPGTQRFELHKKA 
QAS LNAGLDLRLAVQL P PGEDLNDWVAVHVVDFFNR VNLIYGT I 
XDGCT 


6736 


195 


808 


fWYEI^FKRlSMPNIKSLGLTNLNFI^KRI^SVLPLITDYVyFEN 
SSSNPVLIRRIEELNKTASGNVEAKWCFYRRRDISNTLIMLAD 
KHAKEIEBESETTVEADLTDKQKHQLKHRELFLSRQYESLPATH 
IRGKCSVALLNETESVLSYLDKEDTFFYSLVYDPSLKTLLADKG 
E I R VGPR YQAD I PEMLL EGTFFCVFAVL 


6737 


150 


1209 


PVIMPLHFii i'CiDl VRPSCCVSSS PKLRKNAHSRLESYRPDTDLS 
REDrGCNLQHISDRENIDDLNMEFNPSDHPRASTIFLSKSQTDV 
REKRKSLFINHHPPGQIARKYSSCSTIFLDDSTVSQPNLKYTIK 
CVALAIYYHIKNRDPDGRMLLDIFDENLHPLSKSEVPPDYDKHN 
PEQKQIYRFVRTLFSAAQLTAECAIVTLVYLERLLTYAEIDICP 
ANW KR I VLGAI LLASKVWDDQAVWNVD YCQ I I*KD I TVEDMNELE 
RQFLELLQFNINVPSSVYAKYYFDLRSLAEANNLSFPLEPLSRE 
RAHKLEAISRLCSDKYKDLRRSARKRSA5ADNLTLPRWSPAIIS 


6738 


148 


653 


CACAEQPARAEVGAATALPVRWASGEMAPSG3LAVPIiAVLVLLL 
WGAPWTHGRRSNVRVI TDENWRELLEGDWMI EFYAP WCPACQNI* 
QPEWESFAEWGEDLEVNIAKVDVTEQPGLSGRFIITALPTIYHC 
KDGEFRRYQGPRTKKDFINFISDKEWKSIEPVSSWF 


6739 


3 


631 


SWPDMAEEEVAKLEKHLMIJiRQEYVKLQKXIAETEKRCAlXAAO 
ANKESSSESFISRLLAIVADLYEQEQYSDLKIKVGDRHISAHKF 
VIAARSDSWSIJ^LSSTKEIJ)LSDAKPEOTMTMLRMIYTDELEF 
REDDVFLTELMKLANRFQLQLLRBRCEKGVMSLVNVRNCIRFYQ 
TAEBLNASTLMNYCAEIIASHWVSEVEGVNKAL 


6740 


3 


631 


SWPDMAEEEVAKLEKHI^LLRQEWia^KKLAETEKRCALLA^ 
ANKESSSESFISRLtAIVADLYEQEQYSDLKIKVGDRHISAHKF 
VIiAARSDS WSLANLSSTKE LDLS DAN PEVTMTM LR WI YTDELEF 
REDDVFLTELMKIJ^FQLQLLRERCEKGVMSLVNVRNCIRFYQ 
TAEELNASTLMNYCAEI IASHWVSEVEGVNKAL 


6741 


141 


960 


PLTL P FS SRARAGHTMNTS PGT VGSDPVI LATAG YDHTVRF WQA 

HSG I CTRTVQHQDSQVNALE VTPDR5MI AAAVQ P VSLG YQH I RM 

YDLNSNNPNP1 ISYDGVNKNIASVGFHEDGRWMYTGGEDCTARI 

WDLRSRNLQCQRIFQVNAPINCVCLHPNQAELIVGDQSGAIHIW 

DLKTDHNEQLIPEPEVSITSAHIDPDASYMAAVNSTLVPFSCLL 

PLAIGILQEGEFESLARRGLIjFLACQGNCYVWNLTGGIGDEVTO 
LIPKTKIP 


6742 


141 


960 


PLTLPFSSRARAGHTMNTSPGTVGSDPVIIATAGYDHTVRFWQA 
HSGICTRTVQHQDSQVNALEVTPDRSMIAAAVQPVSLGYQHIRM 

ydlnsnnpnpiisydgvnkniasvgfhedgrwmytggedctari 

WDLRSRNLQCQRI FQVNAP INCVCLHPNQAELI VGDQSGAIHI W 
DLKTDHNEQL IPE P E VS I TSAHIDP DAS YMAAVNSTL VP FS CLL 

plaigilqegefeslapj?gllflacqgncyvwnltggigdevto 

LIPKTKIP 


6743 


1 


412 


MHSTQDKS LHLEGD PNPSAAPTSTCAPRKM PKRI S ISKQLASVK 

ALRKCSDLEKAI ATTALI FRNS SDSDGKL3KAIAKDLLQTQ FRN 

FAEGQETKPKYREIIiSELDEHTENKLDFEDFMILLLSITVMSDL 
LQNIR ' 


6744 


95 


1343 


R TPARNR CAu L'E VLS RFSS PNKASSFALQSAGGGLPAVRALRRD " 

RQKVSTVGYGMDEVEQDQHBARLKELFDSFDTTGTGSLGQEELT 

DLCHMLSLEEVapvLQQTLLQDNLIiGRVHFDQFKEALILILSRT 

LSNEEKFQEPDCSLBAQPKYVRGGKRYGRRSLPEFQESVEEFPE 

VTVIEPLDEEARPSHIPAGDCSEHWKTQRSEEYEAEGQLRFWNP 

DDLNASQSGS SPPQD W I BEKLQEVCEDLGI TRDGHLNR KKLVS I 
SEQ
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








CEQYGLQNVDGEMbEB VFHNLDP DGTMS VE DFF YGLFKNGKlS Clf" 
PSASTPYRQLKRHLSMQSFDESGRRTTTSSAMTSTIGFRVFSCL 
DDGMGHASVBRILDTWQEEG IENSQBILJCALDFGLDGNIKLT2L 
TLAtiENELLVTKNS IHQACI 


6745


1 


588


TFRDQGWAQRRRWLLGCASWESWEAAIAAGPGLPSSTARQQNNP " 
AAGTEC FAAVWARGTAMGS VLSTDSGKS APASATARALERRRD P 
ELPVTSFDCAVCIiEVLHQPVRTRCGHVFCRSClATSLKNNKWTC 
P YCRA YL PS EGVPATDVAKRMKSE YKNCAECDTLVCLSEMRAH I 
RTCQKYIDKYGPLQELEETA 


6746 


110 


492 


GATGAMAESAPARHRRKRRST?PtiTSSTLPSQATEKSSYFQTTEI ' 

SLWTVVAAIQAVEKKMESQAARLQSLEGRTGTAEKKLADCBKMA 

VEFGNQLBGKWAVLGTLLQEYGLLQKRLENVENLLRNRN 


6747


247 


484 


EAVTFKDVAWFTEEELGLLDIiAQRKIiYRDVMLENFRNLLSVGH'" 
QPFHRDTFHFLREEKFWMMDIATQREGNSVYAGVC 


6748 


201 


665 


MTTFKEAVTFKDVAWFTEEELGLlibPAQRKLYRDVMLENFRNL 
LS VGNO P FHODTFH FLG KE KFWKMKTT<? r>B priMcnr'Tr t r» t rr m-c*p 
VPE AGPHEEWSCQQ I WE Q IAS DLTRSQNS I RNS SQ FFKEGDVPC 
Q I EARLS I SXVQ QXPYRCNECKQ 


6749 


95 


719 


RRE VKQGDGVCPRARGS PQSQQFPSCAGGGEGLQQSGEALDGAM " 
S AGGPC PAAAGGG PGGAS CSVGAPGGVSM FRWLE V1»E KE FDKAF 
VDVDhhLGE I DPDQAD I TYEGRQKMTSLSSCFAQhCRKAQSVSQ 
INHKLEAQLVDIiKSKIjTEIXJABKWLEKEVHDQLLQLHS iqlql 
HAKTGQSADSGTIKAKLSGPSVEELERELKAN 


6750 


3 


428


SCESRRPGAIG^VWASGALPRDTTGLGSEQPSGDVAQSNRATMGT" 
TAPGP1HLLELCDQKLMEFLCNMDNKDLVWLBEIQEEAERMFTR 
EFSKEPELMPKTPSOKNRRKFG3WT^wnn'srMD"nDTfjDT>T c-oovo 
RSSQLSSRR 


6751 


152 


1417 


PTKATEMAGASVKVAVRVRPFNSREMSRDSKCIIQMSGSTTTIV 
NPKQPKETPKSFSFDYSYWSHTSPEDINYASQKQVYRDIGEEML 
QHAFEGYNVCIFAyGQTGAGKSYTMMGKQEKDQQGII PQLCEDL 
FSRINIOTNDNMSYSVEVSYMEIYCERVRDLLNPKNKGNLRVRE 
HPLLGPYVEDl^KLAVTSYNDIQDLMDSGNKARTVAATNMNETS 
SRSHAVFNI I FTQ KRHDAETNITTE KVSKI S LVDLAGSERADST 
GAKGTRLKEGANINKSLTTLGKVISALAEWDSGPNKNKKKKKTD 
FI PYRDSVLTWLLRENLGGNSRTAMVAALSPADINYDETLSTLR 
YADRAKQ I R CNAV INED PNNKLI R E L KD EVTR LRDLL YAQGLGD 
I TDMTNALVGMS PS SSLSALSSRNV 


6752 


24 


1834 


RNCVPPLGCYRSRVKFHSDIKMQYSHHCEHLLERLNKQREAGFL " 

CZDCTIVrGEFQPKAHRNVLASFSBYFGAIYRSTSENNVFLDQSQ 

VKAIXJFQKLLBFIYTGTLNLDSWNVKEIHO^^YLK^EVVTKC 

KI KME DFAF IAN P SSTE I S S I TGN IE LNQ QTCLLTLRD YNNRSK 

S B VSTDLIQANP KQG ALAKKS S QTKKK KKAFN S P KTGQN KT VQ Y 

PSDILENASVELFLDANKIiPTPWEOVAQINDNSELELTSWE^ 

TFPAQD I VHTVTVKRKRGKSQPNCALKEHSMSNIASVKS PYEAE 

NSGEELDQRYSKAKPMCNTCGKVFSEASSLRRHMRIHKGVKPYV 

CHLCGKAFTQCNQLKTHVRTHTGEKPYKCELCDKGFAQKCQLVF 

HSRMHHGEEKPYKCDVCmQFATSSNLKrHARKHSGEKPYVCDR 

CGQRFAQASTLTYHVRRHTGB KPYVCDT CGKAFAVS S S L I THS R 

KHTGEKPFICELCGNSYTDI KNLKKHKTKVHSGADKTLDSSAED 

HTLSEQDSIQKSPLSETMDVKPSDMTLPLALPLGTEDHHMLLPV 

TDTQSPTSDTLLRSTVNGYSEPQLIFLQQLY 


6753 


2


1305 


VPSLP YP PQKWAHTE FTTSSDSETANGI AKPDP VMPGGE BKAS ' 
PFG IKLRRTN YSIiRFNCDQQABQKKKKRHSSTGD S ADAGP PAAG 
SARGEKEMEGVALKHGPSLPQErKQAPSTRRDSAEPSSSRSVPV 
AHPGPPPASSQTPAPEHDKAAWKMPLAQKPAIiAPKPTSQTPPAS 
SEQ
ID 
NO: 


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PLSKLSRPYLVELLSRRAGRPDPBPSBPSKEDQESSDRRPPSPP 
GPEERKGQKRDEBBBATERKPASPPLPATQQEKPSQTPBAGRKE 
KPMLQSRHS LDGS KLTEKVETAQPLWI TLALQKQKGFRBQQATR 
EER KQAREAKQAEKLS KENVS VS VQ PG SSS VSRAGSLHKSTALP 
EEKRPETAVSRLERREQLKKANTLPTSVTVEISYSSPAAPLVKB 
VSKRFSSPDDAPVSSEPAWLALAKRKAKAWSDCPLIIK 


6754 


2 


413 


F VRRRRRRLGG P E VNTMS SLHKS RIADFQDVLKEPS I ALEKLRE 
LS PSG1 PCEGGLRCLCWKI LLNYLPLERASWTS ILAKQRELYAQ 

PLREMIIQPGIAKANMGVSREDVTFEDHPLNPNPDSRWNTYFKD 
NEVLL 


6755


298 


1343 


PGLOLOVALJSADWFiDMPGGRRGPSRQQLSbSALPSLQTLVGGG 
CGNGTGLRNRNGSAIGLPVPPITALITPGPVRHCQIFDLPVDGS 
LLFE FLFFI YLLVALF 1 QY IN I YKTVWWYPYNHPAS CTS LNFHL 
ID YHLAAFIT VMLARRLVWAL1 SEAT KAGAAS M I H YMVL I S ARL 
VLLTLCGWVLCWTLVNLFR3HSVLNLLFLGYPFGVYVPLCCFHQ 
DSRAHLUjTDYNYVVQHEAVEESASTVGGLAKSKDFLSLLLESL 
KEQFNNATP I P THS CPLS P DL IRNE VECLKADFNHR IKE VL FNS 
LFSAYYVAgLPLCFVKVSGYLTFMCFLDLCVNYINWVFLV 


6756
6757 


180 


754 


ieralgslplsipvswgslrtlkyqqqplrpkvllcqtrvqchd 

LRS LQ PQ P PGLKQS FGLR VX.GLQTGATTPQ LRD LT CKEL 1 1 LTE 

reaqkrkkrkekbsgmaltqgpltfrdvaiefsqeewksldpvo 

KALYWDVMLimRNLVFLGKDNFALEVKICPRVFLYFLCCLSWE 
PFHYLTETEALLTHK 




2 


459 


nsrveapsahsresqgsdamrkhlswwwlatvcmllfshlsavq - 
trgikhrikwnrkalpstaqiteaqvaenrpgafikqgrkldid 
fgaegnryyeanywqfpdgihyngcseanvtkeapvtgcinatq 
aanqgb fqkpdnklhqqvlw 


6758 
6759 


1 


1008 


ABGPKX*PGRRFRDRAPWIjPARt,LRGVLAVWVSLSALGPGSFCRR 
RVPSIiAQLGHSEAAPSPDDVRWSRVPDRCPEERDRAWPPPPPPs 
LPPSFRRNMANNSPALTGNSQPQHOAAAAAACQQQQCGGGGATK 
PAVSGKQGNVLPLWGNEKTMNLNPMILTNILSSPYFKVQLYELK 
TYHE WDE 1 Y FKVTHVEP W E KGSRKTAGQTGMCGG VRG VGTGG I 
VSTAFCLLYKLFTLKLTRKQVMGLITHTDSPYIRALGFMYIRYT 
QPPTDLWDWFESFLDDEEDLDVKAGGGCVMTIGBMLRSFLTKLE 
WFSTLFPRIPVPVQKNIDQQIKTRPRKI 




1 


513 


RKHNFHSLDGTSTRAFHPQTGLPLLSSPVPQRKTQSGCFDLDSS 
I»LHLKSFSSRSPRPCLNIEDDPDIHBKPFLSSSAPPITS1*SLI,G 
NFEES VLNYRFDPLGI VDG FTAJBVGASGAFCPTHLTL PVEVS FY 
S VSDDNAPSP YMG VI TLBSLGKRG YRVP PSGTIQWCVL 


6760 
6761


239 


606 


VLiSKKKGLSAEEKRTRMMEIFSETKDVFQLKDLEKIAPKEKGIT 
i^SVKEVLQSLVDDGMVDCERIGTSNYYWAFPSKALHARKHJCLE 
VLESQLSEGSQ KHAS LQKS I EKAKIGRCETEERT 




29 


1733 


ERTLRGLREVAAPSDVADAAVSRRGRCCCCLHCTQTQVAQDCPS 
SSS S VQRCELS LFQ3 LHTMTS KKLVNS VAGCADDALAGLVACNP 
\jn n. v rtuna u±>u&ui\.ysn vALi Ltb ovjUSGHEPAHAGFIGKG 
MLTGVIAGAVFTSPAVGSILAAIRAVAQAGTVGTLLIVKNYTGD 
RLNFGLAREQARAEGIPVEMWIGDDSAFTVLKKAGRRGLCGTV 
^ KKVAGALAEAGVGLEE I AKQVNVVTKAMGTLGVSLSSCSVPG 
SKPTFELSADEVBLGLG IHGEAG VRR I KMATADE I VKLMLDHMT 
NTTNASH V P VQPGS S WMMVWWLGGLS FLELGI IADATVRSLEG 
RGVKIARALVGTFMS ALEMPG I S LTLLLVDEP LLKL IDAETTAA 
AW PNVAAVS I TGRJCRSR VAPAE PQEAPDSTAAGGS AS KRMAL VL 
BRV{^TLLGLEEHLNALDRAAGDGDCGTTHSRAARAIQEWLiCEG 
?PPASPAQLLSKLSVLLLEKMGGSSGALYGLFLTAAAQPLKAKT 
3LPAWSAAMDAGLEAMQKYGKAAPGDRTMLDSLWAAGQEL 
SEQ
ID 
NO: 


Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide
location
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)




3 


613 


ASTISWRLCVAGABARRPVPVAGEJIAGGGAMWFMYLLSWLStl'l 
QVAP ITLAVAAGL Y YIAEL IEE YTVATSRI I KYMIW FSTAVL IG 
LYVFERFPTSMIGVGLFTNLVYFGLIjQTFPFIMLTSPNFIE*SOG 
LVWNHYLAFQFFABEYY P FSEVLAYFTPCLW I IPFAFFVSLSA 
GBNVLPSTMQPGDDWSNYFTKGKRGK 


6763 


2 


760 


SGPDFPGRRFRGCCCVRPPAGAGMELGGHWDMNSAPRLVSETAE 
RKQEQKTGTEAEAADS GAVGARRFLLCL YLGGFLDLFGVS M WP 
LLS LHVKS LGAS PT VAG I VGS S YGILQL FS S TLVGCWS DWGRR 
SSLLACILLSALGYLLLGAATNVFLFVLARVPAGIFKHTLS ISH 
ALLSDWPEKERPLVIGHFNTASGVGFILGPWGGYJ.TELEDGF 
YLTAF I CFLVFI LNAGLVWFFPRREAKPGSTE 


6764


80 


438


LKKMDTMMLS VRN LF EQLVRRVE I LS EGNE VQF I Q LA KO FEDFR ' 

KKWQRTDHELGKYKDLLMKAETERSALDVKLKI1ARNQVDVEIKR 

RQRAEADCEKLERQIQLIREMLMCDTSGSIQ 


6765 


3 


550 


ARYSRVDHFCRRRCRAVARAPRFLLQFPSGPSRHFLAACVARWL"" 
RGSVLVSEALSGSAKDGIVTEVAVGVKRGSDELLSGSVLSSP.VS 
NMSSMWTANGNDSKKFKGSDKMDGAPSRVLHIRKLPGEVTBTE \ 
VIALGL PFG KVTNILMLlGGKKfQAFLEI*ATEEAAITNGirYYS AVT • 
PHLRNQ ! 


6766 


1 


1287 


EGGS F KAS LT WLWPLGEMKLHCE VE V I SRHL PALGLRNRGKG VR 5 
AVLSLCQQTSRSQPPVRAFLLISTLKDKRGTRYEliRENIEQFFT \ 
KFVDEGKATVRLKEPPVDICLSKANSSSLKGFLSAMRLAHRGCN j 
VDTPVSTLTPVKTSEFENFKTKMVITSKKDYPLSKNFPYSLEHl, ' 
QTSYCGLVRVDMRMLCLKSIiRKLDLSHNHIKKLPATIGDLIHLQ 
ELNLNDNHLBS FS VALCHS TLQKS I»WS LDLS KNKI KALP VQ FCQ 
IOBLOLKLDDNELlQFPCKIGQLINLRFLSAARNKriPFLPqPF 
RNLSLEYLDIiFGNXFEQPKVLPVI KLQAPLTLLES5 ARTI LHNR 
I P YGSH X I PFHLCQDLDTAK1 CV CGRFCLNS F IQGTTTMNLHS V 
AHTWLVDNLGGTEAP I ISYFCSLGCYWSSDI 


6767 


336 


913 


APMICLCSSDLQFRYKEAFLRDRGLQIGYCSVDDDPRMKHFLNV 
GRLQSDNEYKKDFAKSRSQFKSSTDQPGLLQAKRSQQLASDVHY 
RQPLPQ?TCDPEQIjGI,RHAQKAHQLQSDVKYKSDLNLTRGVGWT 
f t^i> i AVhjviAKKiUiiiJuANARGLGLQ^ 
NPDATE ILHVKKKKALLL 


6768
6769 


2 
284 


363 
396 


PGST1S C YLLSEGSLPLCMQVACGEEKHRAPTMKTLRAR FKKTS 
LRLS PTDLGS CPPCGPCP I PKP AARGRRQSQDWG KS DERLLQAV 
ENNDAPRVAALIARKGLVPTKLDPEGKSAFHL 


6770 


1 


397 


MSTPDFSTAENNQEUANEVSCIoKAMLTLMLQAMGQAb 
gRWYQVIWSSTMAXLHDVYKDE^^KI^TEFKYNSVMQVPRVEK 
IOXNMGVGEAIADKKLLDNAAADrj^SGQKPLITKARKSVAGF 
KIRQGYPIGCKVTLRGERMWEFFERLITIAVPRIRDFRGLSAKS 


6771 


3 


378


APAGTLAMTGKSVKDVDRYQAVLANLLLEEDNKFCADCQSKGPR 
WASWNIGVFICIRCAGIHRNLGVHISRVKSVNLDQWTQEQIQCM 
QEMGNGKANRLYEAYLPETFRRPQIDPYLFWSNLEG 


6772


1 


1400 


AAAFLQGMt VNGF INTV1 TS L \ ERR YDLHS YQS GL IAS S YD I AA 
CLCLTFVS YFGGSG \ HKPRW LG WGR \ VLMGTGSLVFAL PHFTAG 
P* *GWKliDAGVRTCPANPR\ P VCAG\HTSGLSRYQLVFMLGQFL 
HGVGATPLYTIiGVTYLDENVKSSCSPIYTAIFYTAAILGPAAGY 
LIGGALLNIYTEMGRRTELTTESPLWVGAWWVGFLGSGAAAFPT 
AVPILG Y PRQL PGSQR YAVMRAAEMHQLKDSSRGEASNPDFGKT 
I RDLPLS I WLLLKNP TFILL CLAGATEATL ITGMSTFS PKFLES 
Q PSLS ASEAATLFG YLVVPAGGG5TFLGGFFVNKLRLRGSAV I K 
FCLFCTWS LLGI LVFSLHC P S VPMAGVTAS YGGSLL PEGHIiNL 
TAPCNAACS CQPEHYSPVCGSDGLMYFS LCHAGCPAATE'nJVDG 
QKVYRDCS CIPQNLSSGFGHATAGXCTST 
ID
to first
amino acid
residue of
amino acid
Predicted end
residue of
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6773 


1 


630 


PWEAPKEHKYKAEEHTVVLTVTGEPCHFPFQYHRQLYHKCTSieg - 
RPGPQPWCATTPNFtlQDQRWGYCLEPKKVKDHCSKHSPCQKGOT 
CVNMPSGPHCLCPQHLTGNHCQKEKCFEPQLLRFFHKNEIWYRT 
EQAAVARCQCKQPDAHCQRLASQACRTNPCLHGCRCLEVEGIIRL 
CHCP VG YTGPFCDVGE * GSGASRRPAPRWDGLAR 


6774 


146 


389 


LTELSDQQYr^FFILSS/WVPTFLSMDVDGRVIKADSFSKIISS 
GLRIGFLTGPKPLIBRVILKIQfVSlljHPSTFNQIiMISQ 


6775 


104 


614 


TCPSQLRVLTARGGRRAPSPQLWTLVLA^IEBKWRSHRILRMNS 
GRPETMENLPALYTI FQGB VAMVTDYG AF I KI PGCRKQGLVHRT 
HMS S CRVD K PSEI VDVGDKVWVXL IGREM KNDR IKVS LS MK WN 
. QGTGKDLDPNMV\SLSKKRGGGDP3RITLGRRSPLRLS 


6776 


3 


1108 


HERHERHBGAbSQDALLRISIPLDSNMRPEKCKRFVHPQWOLLH 
{ LNGTFPKTSDADMBPCVDGWVYDRISFSSTIVTEWDLVCDSQSL 
j TSVAKFVFMAGMMVGGILGGHLSDRFGRRPVLRWCYLQVAIVGT 
j CAALAP'TFliIYCSLRFLSGIAAMSLITNTIMLIAEWATHRFOAM 
J GITU3MCPSGIAFMTLAGLAFAIRDWHILQLWSVPYFVIFLTS 
SWLLESARWLI INNKPEEGLKELRKAAHRSGMKNARDTLTLEIL 
KSTMXKE LEAAQKKKPFLGERLHMPNI CKRI SLLPFTKFANFMA 
Y FGLNLHG/ L KHLGNlfVFIjLQTLFGAV /TP PGQLVLHLGHWGSG 
RVSS RGRVNCLGLFVLQVW 


6777


779 


63 


CFFHGPAWRDCEVRATFAKKQGQSGIISCIAFSPAQPLYACGSY 
GRSLGLYAWDDGSPLALLGGHQGGI THLCFHPDGNRFFSGARKD 
AELLCWDLRQSGYPLWS LGREVTTNQRI YFDLDPTGQFLVSGST 
SGAVSVWDTDGPGNDGKPEPVLSFIiPQKDCTNGVSLHPSLPLLG 
H CI> P VS VCFLS PTESGGRRRGAGPS LGS PRRHVHLECRLQLWW C 
GGGARLQHP+ +SPRARKGR 


6778 


311 


805 


iqsitdesrgsirrknpantrlrlnvp\bbtagdse/erspeeb 

VQADPRI RS AS PKCPTSSPFPKGRS PEG EGET\ D PEKVHFHPG P 
KDKSVAEKN\KGP\SPVSSEGIKDFFSMKPEWENLNQSNVRRMH 
T\AVRLNEVIVKKSRDAKLVLLNMPGPPRNRNGDENY 


6779 


2 


535 


RAJ^RQPRLIJUL^GIEPESMAISEPIKGSRKPCVNKEELALKKP 
MAKCAWKGPREPPQDARAEAES PGGASESDQDGGHES PPKKKAV 
AWVSAKNPAPMRKKKKVSLGPVS YVLVDSEDGRKKPVMPKKGPG 
SRREASDQKAPRGQQPAEATASTSRGPKAKPEGS PRRATNESRK 
V 


6780 


3 


403 


HE VNDNKPEININLKS PGKEEIS YI FBGDP IDTFVALVRVQDKD 
SGLNGEIVCKLHGHGHFKLQKTYENNYLILTNATLDREKRSEYS 
LTVI AEDRGTP S LSTVKHFTVQ IKD I NDNP PHFQRSR YEFVI3B 
K 


6781


1 


1269 


APTRPVFPTLQDLSSSKEPSNSLNLPHSNELCSSLVHPELSEVS 
SNVAPSIPPVMSRPVSSSSISTPLPPNQITVFVTSNPITTSANT 
SAALPTHLQSAI^STVVTMPNAGSIC/MVSEGQSAAQSNARPQFI 
XWFINSSSIIQVMKGSQPSTIPAAPLTTNSGLMPPSVAVVGPL 
HIPQNIKFSSAPVPPNALSSSPAPNXQTGRPLVLSSRATPVQLP 
SPPCTSSPWPSHPPVQQV.KELNPDEASPQVNTSADQMTLPSSQ 
STTMVSPLLTNSPGSSGNRRSPVSSSKGKGKVDKIGQILLTKAC 
KKVTGSLBKGEEQYGADGETEGQGLDTTAPGLMGTEQLSTELDS 
KTPTPPAPTLLKMTSSP VGPGTASAGPSLPGGALPTSVRS I VTT 
LVPSELI SAVPTTKSNHGGI ASESIAG 


6782 


3 


1327 


RKPTVIRIPAKFGKCLKEDPOSPPPLPAEKPIGNTFSTVSGKLS 
NVERTRNLESNHPGQTGGFVRVPPR LPPRpVNGKTI PTQQ PPTK 
VPPERPPPPKLSATRRSNKKLPFNRSSSDMDLQKKQSNLATGLS 
KAKS QVFKNQDP VLPPR PKPGHPLYS KYMLSVPHG r ANED TVS Q 
NPGELSCKRGDVLVHLKOTEWJYXjECQKGSDTGRVHIiSQMKLIT 
PIiDEHLRSRPNPFSPPKAPSHAQKPVDSGAPHAWLHDFPAEQV 
nucleotide
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DDLNLTSGE I VYLLB KIDTDWYRGNC RNQIG I FPAN YVKV* 1 1 DI 
PEGGNGKRECVSSHCVKGSRCVARPEYIGBQKDELSFSEGEIII 
LKEYVNEEWARGE VRGRTGI PPLNPVE PVEDYPTSGANVLSTKV* 

PLKTKKEDSGSNSQVNSLPAEWCEALHSFTAETSDDLSFKRGDR 
I 


6783 


3 


1750 


SYHHHHAQQSAAAS PNLTAS QKTVTTTSMITTKTLPLVLKAATA 
TMPASVVGQRPTIAMWAINSQKAVLSTDVQNTPVNLQTSSKV'r 

GPGAEAVD I VAKMTVTT^VDIVTDDrkD T WDrkt?T t5 T> nDT rnr»r» nmn 

LPQVR PKPVAQNNI PIAPAP PPMLAA PQL I QR P VMLTKFTPTTL 
PTSQNS I H P VRWNGQTAT IAKTFPMAQLTS I VIATPGTRLAGP 
QTVQLS KPS LEKQTVKSHTETDEKQTES RTI TP PAAPKP KRE BN 
POKLAPMVSLGLVTHDHLEEIQSKRQERKRRTTANPVYSGAVFB 
PERKKSAVTYLNSTMHPGTRKRGRPPKYNAVLGFGALTPTSPQS 
SHPDS PENEKIETTFTF PAP VQP VS LP S PTSTDGD I H ED FCS VC 
RKSGQLLMCDTCSRVYHLDCLDPPLKTIPKGMWICPRCQDQMLK 
K2 EAI P W PGTLAI VHS YI A YKAAKEEE KQKLI.K WSS DLKQEREQ 
LEQKVKQLSNS I S KCMEMKNTI LARQKEMHS SLEKVKQLIRL I H 
G I DLSKP VDSEATVGAI SNGPDCTP PANAATSTPAPS PSSQS CT 
ANCMQGEETK 


6784 


3 


1750 


SYHHHHAQQSAAAS PNIiTASQKTVTTTSMITTKTLPIjVLKAATA 
TMPASWGQRPT IAMVTAINS QKAVI*STDVQNTPVNI*QTS SKVT 
vjr\anctf\v\z 1 VAAM 1 VI JjUVUATPPQPIKVPQFIPPPRLTPRPNF 
LPQVRPKPVAQNN I PIAPAP P PMLAAPQLIQRPVMLTKFTPTTL 
PTSQNS IHPVRWNGQTATI AKTFPMAQLTS IVI ATPGTRLAGP 
QTVQLSKPSLEKQTVKSHTETDEKQTESRTITPPAAPKPKREEN 
ryiuwu 1 1 j v ojjjjj v i njjfiijnc. x yoJS^UJiJtKitRXTANP VYSGAVFE 
PERKKS AVT YLNSTMHPGTR KRGR P PKYNAVLGFGAI/TPTSPQS 
SHPDS PENEKTETTFTFPAPVQPVSL PS PTSTDGD IHEDFCSVC 
RKSGQLLMCDTCSRVYHLDCLDPPLKT1PKGMWICPRCQDQMLK 
KEEAI PWPGTIAIVliSYXAYKAAKEEEKQKLIiKWSSDLKQEREQ 
LEQKVKQLSNSISKCMEMKNTII1ARQKEMHSSLEKVKQI.IRLIH 

GIDLSKPVDSEATVGAISNGPDCTPPANAATSTPAPSPSSQSCT 
ANCNQGEETK 


6785 


1 


528 


LGNrVLHYCSMYSKPECLKLLLRSKPTVDIVNQAGETALDIAKR - 
LKATQCEDLLSQAKSGKFNPHVHVEYEWNLRQEEIDESDDDLDD 
KPSPVKKERSPRPQSFCHSSSISPQDKLALPGFSTPRDKQRLSY 
GAFTNQIFVSTSTDSPTSPTTEAPPLPPRNAGKGPTGPPITPHR 


6786 


1820 


1397 


RSPKVLVLAPTRELANHVSRDFKDINTRKLTVARFYGGTSYQSQ - 
INHXRKGIDILVGTPGRIKDHLQSGRLDLSKLRHWLDEVDQML 
DLGFAEQVEDI I HES YKTDS EDKPQTLLFS ATCPQWVYTVA\ KK 
YMKSRYEQVDLDGKWQKAATTVEHXAIQCHWSQRPAVIGDVLQ 
VYSGS BGRAIIFCETKKNVTEMAMNPHIKQNAQCIiHGDrAQSQR 
E I TLKG FREGS FKVL VATNVAARGLDI PEVDLVIQSS PPQDVBS 
Y IHRS GRTGRAGRTGI CICF YQ PRERGQLR YVEQKAG ITFKRVG 
VPSTM DLVKS KSMDAI RSLAS VSYAAVDFFRPSAQRL IEE KGAV 
DALAAALAHI SGAS S FEPRSLITSDKGFVTMTLESLEEIQDVSC 
AWKELNRKLSSNAVSQITRMCLLKGNMGVCFDVPTTESERLQAE 
WHDSDWILSVPAKLPEIEEYYDGNTSSNSRQRSGWSSGRSGRSG 
RSGGRSGGRSGRQSRQGSRSGSRQDGRRRSGNRNRSRSGGHKRS 
FD* VF YHL VDFLSDFLVDSVYLTGRQ IDHLTGLTGL IDHLTSHS 
SVWN 


6787 


2646 


227^0 


PSSFPKNVPLEELEEPPK*KRSGLGSLTPKSQIQNGP*PQTFFP 
FELGSPSGVISAHCNLRLLGSSDSPAPASRVAGIIGTCHHAWLI 
LVFLVBMGFHHVGQAGLKLLTL\ VIHP PWPPKVLGLQT 


6788 


16 


936 


GGTVDLR\DI^VSVLAA\mGGR/ATVRRVRESNVLHEKSKGKT 
REGAEDKMTSGDVLSNRKMFYLLKTAFPSVQrNTEEHVD\ELDQ 
amino acid
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6789 






EVILWGS*Uy*UliPKGK*LliP]KEVPSR/RVLLSGLTPLDATQEV 
FTEDLS K\ YVTTMVCVAVNG KPMLG V I H KP PSEYTAWAMVDGGS 

NVKARSS YNEKTPRI VVSRSHSGMVKQVALQTFGNQTTI I PAGG 
AGYKVIJU^DVPDKSQEKADLYIHVTYIKKWDICAGNAILKALG 
GHMTTLSGEE I S YTGSDG1 EGGLLAS IRMNHQALVRKLPDLEKT 
GHK 


6790 


2 


678 


GNG INVLK1 APESAI KFMAYEQ I KRLVW * * PGDS * G F/ YERLVA 
GSLAGAIAQSSIYPMEVLKTRMALRKTGQYSGMLDCARRILARE 
GVAAFYKGYVPNMLGIIPYAGIDIAVYETLKNAWLQHYAVNSAD 
PG V P VLLACGTMS S TCGQ LAS YP LALVRTRMQAQAS I EGAP EVT 

MSSLPKHILRTEGAFGLYRGLAPNFMJCVIPAVSISYWYENLKI 
TLGVQSR 


6791 


2 


4068 


APPAGRRRMQAAPRAGCGAALLLWIVSSCLCRAWTAPSTSQKCD 
EPLVSGbPHVAFSSSSSISGSYSPGYAKINKRGGAGGWSPSDSD 
HYQWLQVDFGNRKQISAIATQGRYSSSDWVTQYRMLYSDTGRNW 
KPYHQDGNIWAFPGNINSDGWRHELQHPIIARYVRIVPLDWNG 
EGRIGLRIEVYGCSYWADVINFDGHVVLPYRFRNKKMKTLKDVI 
ALNFKTS E SEGVI LHGEGQQGD Y ITLBLKKAKL VLS LNLGSNQL 
GPIYGHTSVMTGSLLDDHHWHSWIERQGRSINLTLDRSMQHFR 
TNGE FD YLDLD YE I TFGG I P PSGKPS SSS R KNFKGCMES INYNG 
VNITDLARRKKLEPSNVGNLSFSCVE PYTVPVFFNATS YLEVPG 
RLNQDLFS VS FQFRT WNPNGLLVFSH FADNIX5NVE IDLTESKVG 
VHIN1 TQTKMSQID I S SGSGLNDGQWHE VR FIAKEN FAI LT I DG 
DEASAVRTNSPLQVKTGEKYFFGGFLWQMNN2SHSVLQPSFQGC 
MQLIQVDDQLVNBYEVAQRKPGSFANVSIDMCAI IDRCVPNHCE 
HGGKCSQTWDSFKCTCDETGYSGATCHNSXYEPSCEAYKHLGQT 
SNYYWIDPDGSGPLGPLKVYCNMTEDKVWTIVSHDLQMQTPWG 
YNPEKYS VTQLVYSASMDQ I SAI TDSAB YCEQ YVS YFCKMSRLL 
NTPDG3 P YTWWVGKANEKHYYWGGSGPGIQKCACGIERNCTDPK 
YYCNCDADYKQWRKDAG FLS YKDHLPVSQWVGDTDRQG SEAKL 
SVGPLRCQGDRNYWNAASFPNPSSYLHFSTFQGETSADISFYFK 
TLTPWQVFLENMGXEDFI3CLBLKSATE VS FS FDVGNG PVEI WR 
S PTPLNDD QWHRVTAERNVKQASliQVDRLPQQ IRKA P TEGHTRL 
EliYSQLFVGGAGGQQGFLGCIRSLRMNGVTLDLEERAKVTSGFI 
SGCSGHCTSYGTNCENGGKCLERYHGYSCDCSNTAYDGTFCKKD 
VG AFFEEG MWLRYWFQ APATNARD S S SR VDNAPDQQNS HPDLAQ 
EEIRFSFSTTKAPCILLYISSFTTDFLAVLVKPTGSLQIRYNLG 
GTREP YNI D VDHRNMANGQPHS VN I TRHBKTI FLKLDHYPS VS Y 
HLPSSSDTIiFNSPKSLFLGKVIETGKIDQBIHKYNTPGFTGCLS 
RVQFNQIAPLKAALRQTNASAHVHIQGELVESNCGASPLTLSPM 
SSATDPWHLDHLDSASADFPYNPGQGQAIRNGVNRNSAI IGGVT 
A\ WI FTP S 1*CTP \ VLP * SR * HVS PHKGTLP I PNEAKGAGSRQK 
KPGRRPSMMNDpPTSQRPIDESiCKEWPHLRGGYLAMG 


6792


1801 


n$3 — 


tghegakgekgdkgdlgprgergqhgpkgekgypgippeiTpgw" 

SAW*SWLTAASTKVQAILLPQPLE* LGLQIAFMASLATHFSN Q 
NSGIIFSSVETNIGNFFDVMTGRFGAPVSGVYPFTFQMMKTTOnif 

BEVYVYLMHNGNTVTSMYSYEMKGKSDTSSNHAVLKLAKGDEVW 
LRMGNGALHGDHQRFS TFAGFLLFBTK 




33 


1073 

i 
2 


VRHTNWGVDMY LFS LGSESPKGAIGH I VSTEKTI LAVERNKVLL 
PP LWNRTF S WG FDDFS CCLGS YGSDK VLMTFENLAAWGR CLCAV 
CPS PTTI VTS GTSTVVCVWELSMTKGRPRGLRIiRQAL YGHTQAV 
rCLAASVTFSHiVSGSQDCTCILWDLDHIiTHVTRLPAHREGISA 
I TI SD VSGT I VS CAGAHLSLWNVNGQP LAS I TTAWGPBGA ETCC 
:lmegpawdtsqi I ITGS QDGMVK VWKT/VGCED VCS WTASRRG 
\PGSASKPKRPQVGEEPGLESRAGR* HCFDREAQQNQ P \ PVTAL 

wsrnhtkllvgdergrifcwsadg*eergsrgsgttvpg 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6793


2340 


805 


GR KE ANY \ YGSLTOAGTVSLG LDAEGQEVFVP FSAVL PMVAPND 
LVFDGWD I SS LNLAEAMRRAKVLD WGLQEQLW PHMEALRPRPS V 
Y I PEFI AANQS ARADNLI PGS RAO QLEQIRRD IR DFR SSAGLDK 
VIVLWTANTERFCEVIFGLNDTAENLLRTIBLGLSVSPSTLFAV 
ASILEGCAFLNGSPQNTLVPGALELAWQHRVFVGGDDFKSGQTK 
VXS VLVD FL I G SGLKTMS I VSYNHLGNNDGBNLS APLQ FRS KE V 
SKSNWDDMVQSNPVLYTPGEEPDHCWIKYVPYVGDSKRALDE 
YTSELMLGGTNTLVLHNTCEDSLLAAPIMLDLALLTELCQRVSF 
CTDMD PE PQTFH PVLS L1»S FLFKAPIiVPPGS PVVNAIjFRQRS C I 
ENILRACVGLPPQNHMLIjEHKMERPGPSLKRVGPVAATYPMLNK 
KGPVPAATNGCTGDANGHLQEEPPMPTT*GPGHTVSRI*FLPAAP 
HDPTLKAPTNKGRCHFSPPSTWGSWGL 


6794 


169 


1349 


DDVKRKPEASAH* EKPGPPSRPGVRGGRERAGGRGSHGARS CR\~ 

EPAPPAPAPPEDHPDEEMGFTIDIKSFLKPGBKTYTQRCRLFVG 

NLPTDITEEDPKRLFERYGEPSEVFlNRDRGFGFIRliESRTLAE 

FSQFGPVEKAWWDDRGRATGKGFVEFAAKPPARKAIiERCGDG 
AFLLTTTPRPVIVBPMEQFDDEDGLPEKLMQKTQQYHKEREQPP 
R FAQ PGTFE FEYASRWKALDEMBKQQREQVDRNI REAKE KLEAE 
MEAARHEHQLMLMRQDLMRRQEELRRLEELRNQELQKRKQIQLR 
HEEEHRRREEEMI^HREOEELRRf5nEnPKT>?Jrv , Mi» , iJVvr*»WT?r 0 


6795


1740 


1010 


gprrqtqvrdiieldsf*dwaaqbtdcaqnsgerl*kgv;lenfs 

TMS KSAVKISIJ)LLSNPLCEQDQDLIjNMVTALDTAMKRMDAFNQ 

ekotqiqktvieplkkfgsvfpslnmavkrreqalqdyrrlqak 
vekyeekektgpvlaklhqareelrpvredfeaknrqlleempr 

FYGSRLDYFQPSFESLiIKAQWYYSEMHKIFGDLSHQLDQPGHS 
DEQRERENEAKLSELRALS IVADD 


6796


48 


683 


GKE IQI PTIKLAWLLFGLE * PVGALGKGVVSF* * SHVALGQLGW 
LTRAVRSSWRWBLCVSAQEWSQRSA*SSP3PVGACPSLNPPET 
SVQEGRDCWQR*LPRLFSALVGQPGCWPQGAPPERCV*PGRCKW 
HLQSQ VLR*ERRRCCRCL PRFA*GWRRRHQRLG LG I HPAPLGST 
SPPHPEGNSQQCRR *GWAAELRLPS S WL * GKLGC* 


6797 


1620 


211 


TERMTPSQPTRGSSCTRFSSMLWTSTWRCXiTCHWAGMRMSWGV 
TI/3PMAQGLLSASGTTXEATWTRPTTHLTLIRWWLLTASRVDPP 
BRPPPPPSDDLTLLESSSSYKNL/DAQIPQ/DWSMSPSTSG*RP 
IiTSRASSIMRSRTAIPSAS*SRLTTKHTVGGSPSAWRPRPTSRS 
VSTPVSSSTETTASGS CLTWWSSSPAPCPSS SAPAHSFEASCCK 
TSLWGSCGGSGDGSSACGSGWNLSMAGTSCSSPAMCSPSRAPS* 
RSASR PRTWRATTS AASSWAPRRCW CGWA* S AT* PSSTTTISSS 
PHCGWPCPASCASAAAWLSSTWATASVAGSCWGPIM^SSAHSPW 
CLSACSRSSMGTTCL*RSPP\SGASRAAAAWCGSSPSSTFTPSS 
ASSSTWCSASSSRSSPAPTTPSSIPAAQAQRRASCRPTSHSART 
APPPAS SAAGAARPAAFSAAAEGTPRRS I RCW 


6798 


3894 


1696 


stiswesleswlnkatnp3nrqedweyiigfcdqinkeiieg*vs 
alwgqlrgsglgrgttmakegqpgsprjlsalecvllvpq\pqia 
vrlitahkiqs pqeweaiiqaltylgdrvs bkvktkv iellys wtm 
alpbeakikdayhmlkrqgivqsdppipvdrtlipsppprpknp 
vfddeekskllakllksknpddlqeankliksmvredeariqkv 
tkrlhtleevnnnvrllsemllhysqeds sdgdrelmkelfdqc 
enkrrtlfklasetedndnslgdiloasdnlsrvinsyktiieg 
0vingevatltlpd5egnsqcsnox3tlidlaeldttnslssviia 
paptppssgrpllppppqasgpprsrsssqaeatlgpsstsnal 
s wldeellclglad papnvp pkesagnsqwrllqreqs dldffs 
prpgl'aacgasdapllqpsapsssssqaplpppfpapwpasvp 
APS AGSSLFSTGVAPALAPKVB pavpghhglalgnsalhhldal 
DQLLEEAKVTSGLVKPTTSPLI PTTTPARPI*LPFSTGPGS PLFQ 
amino acid
residue of
Predicted end
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PLSFQSQGSPPKGPELSIiASIHVPLESIKPSSALPVTAyDKNGP" 
RILFHPAKECPPGRPDVLWWSMLNTAPLPVKS IVLQAAVPKS 
MKVKLQPPSGTBLSPPSPIQPPAAITQVMLLANPLKEKVRLRYK 
LTFALGEQLSTBVGEVDQFPPVEQWGNL 


6799 


3894 


1696 


stiswesleswlnkatnpsnrqedweyiigfcdqinkbleg*vs 
alwgqlrgsglgrgttmakegqpgsprlsalecvllvpq\pqia 

VRLLAHKIQSPQEWEALQALTYLGDRVSEKVKTKVIELLYSWTM 
ALPEEAKIKDAYHMLKRQGIVQSDPPIPVDRTLIPSPPPRPKNP 
VFDDEBKSKLLAKLLKSKNPDDLQEANKLIKSMVRBDEARIQKV 

ENKRRTL FKLAS ETE DNDNS LGD ILQASDNLSRVINS YKTI 1 EG 
OVINGBVATLTLPDSEGNSQCSNQGTLlDLAELDTTNSLSSVXiA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAL 
SWLDEBLLCLGLADPAP3»VPPKESAGNSQVfHLljQREQSDLDFFS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAFWPASVP 
APSAGSSLFSTGVAPALAPKVEPAVPGHHGLALGNSALHHLDAL 
DQLL E E AKVTS GL VKP TTS PL I PTTTPARP LLPF5TGPGS PLFQ 
P IS FQSQGS P PKGPE LSLAS IHVPLES IKP SSALPVTAYDKNGF 
R ILFHFAKECPPGRPDVLVWVSMLNTAPLPVKS IVLQAAVPKS 
niw i\Lt\ji>v^[a i fcLbPr SPIQPPAAITQVMLLANPLKEKVRLRYK 
LTFALGEQLS TEVGEVDQFPPVE QWGNL 


6800 


404 


1646 


RRSPSTGLSPVPQPSSPSLSDYSI?WSLLLSGTIAWATPGK*AG 
T P QAW * LGLAPAIAF I / GLTRGRKQNKEKMAEGGSGDVDDAGDC 
SGARYNDWSDDDDDSNESKS IVWYPPWAR IGTEAGTRARARARA 
RATRARRAVQ KRAS PNSDDTVLS PQELQ KVLCLVEMS EK P YILE 
AALIALGNNAAYAFNRDI IRDLGGLPIVAKILNTRDPIVKEKAL 
I VXiNNLS VNAENQRRLKVYMNQVCDDTI TSRLNS SVQLAGLRLL 
TNMTVTNE YQHMLANS I SDFFRLFSAGNEETKLQVLKLLLNLAE 
WPAMTRELLRAQVPSSLG\SLFNKKENKEVILKLLVIFENINDN 
FKWEENEPTQNOFGEGSLFFFLKEFQVCADKVLGIESHHDFLVK 
VKVGKFMAKLAEHMFPKSQE 


6801 


2 


1755 


SAEEFE S QQAS VTMHDVDAES FB VL VDi C Y^ (3RVSLS EANVERL 
YAASDMLQLEYVREACAS FLARRLDLTNCTAI LKFADAFGHRKL 

RSQADSYT AfJWPKnT.CUM^CTOlTirTT jvtvt t»t ant t 7itrr m ncr t«k 

VE SEQTVCHVAVQWLEAAPKBRG PSAAE VFKCVRWMH FTEEDQD 
YLEGLLTKPIVKKYCLDVIEGALQMRYGDLLYKSLVPVPNSSSS 
/R *QQQLSCI CSRKSTPETG YVCQGDGDLLWTPQRSLS \RYDP Y 
S GDI YTMPS PLTSFAHTKTVTS SAVCVSPDHD I YLAAQPRKDLW 
VYKPAQNS WQQLADRLLCREGMDVA YLNGYI YXLGGRDP I TG VK 
LKEVBC YS VQRNQWALVAP VPHS FYS FELI WQNYLYAVNS KRM 
LCYDPSHNMWLNCASLKRSDFQEACVFNDEIYCICDIPVMJCVYN 
PARGBWRRISNIPLDSETHNYQI VNHDQKLLL ITSTTPQWKKNR 
VTVYEYDTRBDQWI NIGTMLGLLQFDSGFI CLCARVYPSCLEPG 
QSFITEEDDARSESSTEWDLDGFSELDSESGSSSSFSDDEVWVQ 
VAPQRNAQDQQGSL 


6802 
6803 


157 
1 


1341 
2203 


ET FPLFFFLLSKTPG KTASMAHFVQGTS RM I AAE^STEkKECAB 
PSTRKNLMNSLEQKIRCLEKQRKBLLEVNQQWDQQFRSMKELYE 
RKVAELKTKLDAAERFLSTREKDPHQRQRKDDRQREDDRQRDLT 
RDRLQREEKEKBRLNEE^ELKEENKLLKGKNTLANKEKEHYEC 
EI KRLNKALQDALNIKCSFSEDCLRKSRVEFCHEEMRTEMBVLK 
QQVQIYEEDFKKERSDRERLNQEKEELQQINETSQSQLNRLNSQ 
I KACQMEKEKLEKQLKQMYCPPCNCGLVFHLQDPWVPTGPGAVQ 
KQREHPPDYQWYALDQLPPDVQHKAN/DWCLAPPPVCCQAG/PR 
TPGLK*SSCLWLPKC*NFRFILSKESPSVEVHTNRERQQATRBR 
G 

KLSGRPYRHMGVLGTS KLYDIRKTI FTFTPQF IDQQQFYLALDN 
KM I VE MLRTDLS Y LCSRW RMTGQPT ITF P I SHS MLD £ DGTSLbl S 
S I LAALRKMQDG YFGGARVQTGKLSEFLTTSCCTHLS FMDPGPE 
GKLYSEDYDDNYDYLESGNWMNDYDSTSHARCGDEVARYLDHLL 
AHTAP HP KLAPTS Q KGGLDR FQAAVQTTCDLMSLVTKAKELHVQ 
NVHMYLPTKLFQASR PS FNLLDS PHPRQBNQVPSVRVE IHLPRD 
QS GE VDFKALVLQLKETS SLQEQAD I LYMLYTMKGPDWNTEL YN 
ERSATVRELLTELYGKVGE IRHWGLIRYISGI LRKKVEALDEAC 
TDLLSHQKHLTVGLPPFPREKTI SAPLP YEALTQLIDBASEGDM 
SI SILTQE IMVYLAMYMRTQPGLFAEMFRLRIGLIIQVMATELA 
HSLRCSAEEATEGLMNLSPSAMKNLLHHILSGKEFGVERK/SVR 
PTDSNVS PAI S IHE IGAVGATKTERTG I MQLKSE IKQVE FRR LS 
I SA3SQS PGTSMTPSSGS FPSAY DQQSS KDS RQGQWQRRRRLDG 
ALNRVPVGFYQKVWKVLQKCHGLS VEGF VLP SSTTREMTPGEI K 
F3 VHVES \VLNVLLRPE YRQIiLVEAILVLTWLADIEIHS IGSI I 
AVEXI VH I ANDI*F3jQEQ KTIGP \ DDTM LAKD PASG\ I CTLR\ YD 
SAPSGRFGTMTYLS \RAA\ATY VQEFLP\HS I CAMQ 


6804 


1 


951 


GSPGKKEEKAKNKESLCMENSSNSSSDEDEEETKAKMTPTKKYN 
GLE E KRKSLRTTG F YSGFS EVAE KR I KLLKNS DERLQ NS RAKDR 
KDVWSSIQGQWPKKTLKELFSDSDTEAAASPPHPAPEEGVAEES 
LQT VAEE ESCS PSVEI»B KPPP VNVDS KP I EE KT VE VNDRKAE FP 
S SGSNFS A* I PLP YLHLNRLHQ S L * QKGSRQQS S VT VSE PLAPN 
QBE VRSI KSETDSTIE VDS VAGELQDLQSERB * LAS R F * CQCKL 
KQ * *SARTRTS * KSLYRSEKSBRCSGRRKFI KKAEKKP * SNSGK 
QQKEGKRHK 


6805 


1539 


206 


RQPDLKYFGKSFDVSVSESSSLIiSNDLPKFADGIKARNRNQNYL 
VPSPVLRILDHTAFSTEKSADIVICDEECDSPESVNQQTQEESP 
IEVHTABDVPIAVEVHAISEDYDIETENNSSESLQDQTDEEPPA 
KLCKILDKSQALNVTAQQKWPLLRANS SGLY KCELCE FNSKYFS 
DLKQHMILKHKRTDSNVCRVCKESFSTNMLL I EHAKLHEEDPYI 
CKYCDYKTVIF3NLSQHIADTHFSDHLYWCEQCDVQPSSSSBLY 
LHFQEHSCDEQYLCQFCEHETNDPEDLHSHWNEHACKLIELSD 
KYNNGEHGQYSLLS KI T FDKCKNFFVCQ VCG FRS RLHTNVNRHV 
AIEHTKI FPHVCDDCGKGFSSMLE\ IAKHLNSHLSBG I YLCQYW 
E YS TGQI EDLKIHLD FKHS ADL PHKCSD CLMRFGNERELI SHLP 
VHETT 


6806 


272 


3794 


VALCF PNSDPVM FMDAF YG CLLAELG P V PI EVPLTRKDAGSQQV 
GFLLGS CGVFLALTTDACX?KGI;PKAQTGEVAAFKG WP PLS WLVI 
DGKHIiAKPPKDWHPLAQDTGTGTAYIEYKTSKEGSTVGVTVSHA 
SLLAQCRALTQACGYSEAETLTNVLDFKRDAGLWHGVLTSVMNR 
MHWSVPYALMKANPLSWI QKVCFYKARAALVKSRDMHWSLLAQ 
RGQRDVSLS SLRMLIVADGANPWS I SS CDAFLNVFQSRGIiRPEV 
ICPCASSPEALTVAIRRPPDLGGPPPRKAVLSMNGLSYGVIRVD 
TBEKLS VLTVQDVGQVMPGANVCWKLEGTPYLCKTDEVGE ICV 
SSS ATGTAYYGLLG ITKNVFEAVPVTTGGAP I FDRP FTRTGLLG 
FIG P DHLVF I VGKLDGLN VTG VRRHHADDWATALAVE PM KFVY 
RGRIAVFS VTVLHDDRI VLVAEQRPDASEEDS FQ WMSRVLQA I D 
SIHQVGVYCIALVPANTLPKAPLGG IKI SETKQRFLEGTLHPCN" 
VI^CPHTCVTNLPKPRQKQPEVGPASMIVGNLVAGKRIAQASGR 
EI^LEDSDQARKFLFLADVLQWRAHTTPmPLFLLLNAKGTVT 
STATCVQL9KRAERVAAALMEKGRLSVGDHVALVYPPGVDLIAA 
FYGCLYCGCVPVTVRPPHPQNLGTTLPT\n<MIVEVSKSACVLTT 
QAVTRLLRSKEAAAA VDIRTWPTILDTDD IPKKK I AS VFRPPS P 
DVLAYLD FSVSTTG ILAG VKMSHAATSALCRS I KLQCELYPSRQ 

iaicldpycglgfalwclcsvysghqsvtivpplelesnvslwlis 
avs q ykarvtfccys vmbm ctkglgaqtgvlrmkg vnls cvrtc 
mwaeerpVrialtqsfsklfkdlglparavsttfgcrvwvaic 
LQGTAGPDPTTVYVDMRAIiRHDRVRLVERGSPHSLPLMESGKIL 
PGVKVI IAHTETKG PLGDSH LGEI WVS SPHNATG YYT V YGEEAL 
HADHFSARLS FGDTQT IWARTC YLGFTiRRTELTDASGGRHDAL Y 
WGSLDETLBLRGMRYHPIDIETSVIRAHRS I AECAV FTWTNLL 
VWVELDGLEQDALDIiVALVTNWTjEBHYLWGVWI VDPGVI p 
INSRGEKQRMHLRDGFLADQLDPIYVAYNM 


6807 


1444 


606 


VGHDT\fflAMFTCFPKCLGFSPPVNVTVSPRSEESHTTTVSGGNG " 
SVFQAGPQLQALANLEARRGSIGAALSSRDVSGLPVYAQSGEPR
RLTQAQVAAPPGENALEHSSDQDTWDSLRSPGPCSPLSSGGGAE 
SLPPGGPGHAEAGHLGKVCDFHLNHQQPSPTSVLPTEVAAPPLE 
KILSVDSVAVDCAYRTVPKPGPQPGPHGSLLTEGCLRSLSGDLN 

rfpcgmevhsgqreleswavgeamaXlkfpmgamsyclrdrsr 

FLFRLPMGLSCPLQVQ 


6808


2063 


737 


GVGSGAASALARSRPLASRliSSRRRTRAPRSGAMQRLAMDLRML 

sreioslylehqvrvgffgsgvglslilgfsvayafyylssiakk 
pqi.vtggesfsrflqdhcpvvtetyyptvwcwegrgqtllrpf\ 

1 TS KPPVQYRNEL IKTADGGQI S LDWFDNDNSTC YMDASTRPT I 
LLLPGLTGTS KES YI LHMIHLS E ELGYRCWFNN RG VAGENLLT 

PRTYCCANTEDLETVlHHVHSLYPSAPFLAAGVSMGGMLIiLNYL 
GKIGSKTPLMAAATFSVGWMTParQPQr.RVDr ktut r wmwt 

LQSSVNKHRHMFVKQVDMDHVMKAKS IREFDKRFTSVMFGYQTI 
DD YYTDASPS PRLKS VG I PVLCLNS VDDVFS PSHAI P I ETA KQN 
PNVALVLTS YGGH1 G FLEG I WPRQS T YMDR VFKQFVQAMVBHGH 
ELS 


6809 


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
TDEAAQTPSQPLHPSDPTEKQQPKRLHVSNI PFRFRDPDLRQMF 
GQFGKILDVEIIFNERGSKGFGFVTFETSSDADRARPvt uhttxt 
EGRKIEVNNATARVMTNKKTGNPY TNGWKLNP WGAVYG P3 fya 

vtgfpypttgtavayrgahlrgrgravyntfraapppppiptyg 

AWYQDGFYGAEI \LEATQPTDTLS PLQRRQ PTATVTAESTQLP 
TRTITPSGPRRPTALEPCETFHRFLLGP 


6810
6811


939 


65


dysgqtpvptehgmtlytpaqthpeqpgseastqpiagtqrvpq 
tdeaaqtdsqplhpsdptekqqpkrlhvsnipfrfrdpdlrqmf 

GQFGKILDVBI I FNERGSKGFGFVTFETSSDADRAREKLNGTI V 
EGRKI E VNNATAR VMTNKICTGNP YTNG^LNP VVGAVYGPEFYA 
VTGFPYPTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPPPIPTYG 
AWYQDGFYGAS I \LEATQPTDTLSPLQRRQPTATVTAESTQLP 
TRTITPSGPRRPTALEPCETFHRFLLGP 




1522 


650 


DLVTVWSFVDCRVIASTHGH\KSWVSWAFDPYTTSVEEGDPME 
PSGS DEDFQDLLHFGRDRADS TQCRLSRRNSTDS R P VSVT YRFG 
SVGQDTQLCLWDLTEDILFPHQPLSRARTHTNVMNATSPPAGSN 
GNSVTTPGNSVPPPLPRSNSLPHSAVSNAGSKSSVMDGAIASGV 
SKFATLSLHDRKERHHEKDHKRNHSMGHISSKSSDKLNLVTKTK 
TDPAKTLGTPLCPRMEDVPLLEPLICKKIAHERLTVLIFL2DCI 
VTACQEGFICTWGRPGKVVS FNP 


6812 


4001 


» 


EDAVFSLDLSTI I QGTMFLHGEEL KSNEP EGQVE PGALR YR IEQ 
KGLQHRLI LHAVKHQDSGALVGFS CPGVQDSAALTIQESPVHIL 
SPQDKVSLTFTTSERVVLTCELSRVDFPATWYKDGQKVEBSELL 
VVKMDGRKHRLI L PEAKVQDS GEFECRTEGVSAFFGVTVQDPPV 
HIVE)PREHVFVHAITSBCVMLACEV\DR\EDAPVRWYKDGQEVE 
ESDFWLENEGPHRRLVLPATQPSDGGEFQCVAGDECAYFTVTr 
TDVSS W I VYPSGKVYVAAVRLERVVLTCELCRP VJAE VRWTKDGE 
EWES PALLLQ KEDTVRRLVL PAVQLEDSGEYLCE I DDESAS FT 
VTVTEPPVRI I YPRD E VTL I A VTLE CWLMCELSRE DAPVRWYK 
DGLEVEESEALVLERDGPRCRLVLPAAQPEDG GE FVCDAGDDSA 
FFTVTVTEPPVQFLALETTPS PLCVAPGEPWLSCELSRAGAPV 
VWSHNGRPVQEGEGLELHAEGPRRVLCIQAAGPAHAGIiYTCQSG 
AAPGAPSLSFTVQVAEPPVRWAPEAAQTRWSTPGGDLELWK 

GEYLCDAPQDSRIFLVSVEEPLLVKLVSDLTPLTVHEGDDATFR 
CEVS PPDADVTW LRNGAWTPGPQRQS CCS YGGCRMCGQRKART 
CVSKWRQAEWVQRGPCAGCEVGSPCPTTLACPWPRMGTSTASSS 
MVSYWPTRAPTAARATTIAPWPGSA 


6813 




836 


SSTQQRPGVPAGPRPLDGYLGVADHKPLKMHCRDCALVTSSGHL 
IiHSRQGSQI DQTECV3 RMNDAPTRG YGRDVGNRTSLRVIAHSSI 
QRILRNRHDLLNVSQGTVFI FWGPSS YMRRDGKGQVYNNLHLLS 
QVLPRI>KAFMITRHKMLQFDELFKQETGQ\NRKrSNTWLSTGWF 

i rl l A/vuCtiJV^lJil i. JM v i (jW^a* *r CKJJr NHFS V r I H YYSPFGPDEC 

TMYLSHERGRKGSHHRFITEKRVFKNWARTFNIHFPQPDWKPES 
LAINHPENKPVF 


6814 


3 


737 


K FRRQ SAN/ AR ERNRMHGLND ALDNLRKW P CYS KTQ XLS K I ET ~ 
LRLAKN YZ WALSEI LR IGKRPDLLTFVQNLCKGLSQ PTTNLVAG 
CLQLNARS FLMGQGGEAAHHTRSP YSTFYP PYHSPELTTPPGHG 
TLDNS KSMKP YNYCSAYE S FYESTS PECAS PQFEGPLSPPP INY 
Wv» I rbi^^aiiTl^xGKJJYNYGMHY 
SI I FP YDLHLRS Q3 LTMQDE LNAVFHN 


6815 


sod 


553 


QGLDPASQTKWELLKDGSGRRGDRRSSRDMAGGAGPRSESDLE 
DVGPTAEWNGDGSGSLRRSGSPGKLRDAIiRRSSEMLVKKLQGGT 
PQEP PNPRMKR AS SLNFLNKSVEEPTQ PGG 




1 


803 


NL LKTHKF\I>LGQDEDSLHS VPVAQMGNYQE YLKTLASPLRE ID 
PDQPKRLHTFGNPFKQDKKGMMIDEADEFVAGPQNKVKRPGEPN 
SPMSSKRRJRSMSLIjLRKPQTPPTVTNHVGGKGPPSASWFPSYPN 
LI KPTLVHTDATI IHLX3HEEKMENGQITPDGFLSKSAPSBLINM 
TGDLMPPNQVDSLSDDFTSLSKIXSLIQKPGSNAFVGGAKNCSLS 
VDDQKDPYASTLGAMPNTLQITPAMAQGINADI KHQLMKBVRKF 
GRSK 


6817 


172 


3457 


LGMMDS PKIGNGLPVIGPGTDIGI SSLHMVGYLGKNFDSAKVPS 
DEYCPACKEKG KLKALKTYRISFQESI FLCEDLQCI YPLGSKSL 
NNIi IS PDLEECHTPHKPQKRKS L ESS YKDSLLLAHSKKTRNYI A 
I DGGKVLNS KHNGEVYDET 3S NLP DSSGQQNP I RTADS £>ERNE I 
LEADTVDMATTKDPATVDVSGTGRPSPQNEGCTSKLEMPLESKC 
TSFPQALCVQWKNAYALCWLDCI LSAIiVHSEELKNTVTGLCSKE 
ES IFWRLLTKYNQANTLLYTSQIjSGVKDGDCKKLTSKI FAEIET 
CLNEVRDEIPISLQPQIJICTLGDMESPVFAFPLLLKLETHIEKL 
FLYSFS WDFECSQCGHQ YQNRHMKSLVTFTNVI PEWHPLNAAHF 
GPCNNCNSKSQIRKMVLEKVSPIFMLHFVEGLPQNDLQHYAFHF 
EGCLYQ ITS VI Q YRANNHFI TWILDADGS WLECDDLKGPCS ERH 
KKFEVPASEIH I VI WERKIS QVTDKEAACLP LKKTNDQHALSNE 
KPVSLTS CS VGDAAS AETAS VTHPKDI S VAP RTLSQDTAVTHGD 
HUjSGPKGLVDNri»PLiTLiERTTmCTa.«;v^nr WQVAPr \t vhitdi? 

AENTGILKTNTLLSQESLMASSVSAPCNEKLIQDQFVDISFPSQ 
VVOTNMQSV^LNTEDTVNTKSVNNTDATGLIQGVKSVEIEKDAQ 
LKQFLTPKTEQLKPERVTSQVSNLKKKETTADSQTTTSKSLQNQ 
SLKENQKKPFVGSWVKGLISRGASFMPLCVSAHNRNTITDLQPS 
VKGVNNFGGFKTKGINQKASHVSKKARKSASKPPPI SKPPAGPP 
SSNGTAAHPHAHAAS EVLEKSGSTS CGAQLNHSS YGNG I S SANH 
EDLVEGQIHKLRLKLRIOKLKAEKKKIiAAI/^SPQSRTVRSBNLE 
QVPQDGS PNDCES IEDLLNELP YP IDIANESACTTVPGVSL YSS 
QTHE E I IiAELIjS PTP VSTELS ENGEGDFRYLGMGDSH IP PPVPS 
EFNDVSQNTHLRQDHNYCS PTKKNPCEVQPDSLTNNACVRTLNL 
ESPMKTDI FDEFFSSSALNALANDTLDLPHFDEYLFENY 


6818 


2 


240 


RG FDKVLWT/LS GAVK \CVQ FSR 1 3 PDGEEG YPG E uKVWVT YTL 
DGGE/LHS / ATTEHKP /VQATP VNLT\TI LTSTWQARLPQI 


6819 


1 


961 


G I PCTBMGNFDNANVTGEI B FAIH YCFKTHSLE IC1 KACKNLAY 
GEEXKKKCNPYVKTYLLPDRSSQGKRKTGVQRNTVDPTFQETLK 
YQVAPAQLVTRQLQVS VWHLGTLARRVPLGKVI I PLAT WD FEDS 
TTQSFRWHPLRAKADKYBDSVPQSNGELTVRAKLVLPSRPRKLQ 
EAQEXTTDOPSLHGQLCLWLGAKNLPVRPDGTLNSFVKGCLTLP 
DQQKLRLKSPVLRKQACPQWKHSFVFSGVTPAQIjRQSSLELTVW 
DQALFGMNDRLLGGT\RLGSKGDrAVGGDACSQSKLQWQKVLSS 
PNLWTDMTLVLH 


6820 


1014 


340 


GDMVYIVGHVPPGFFEKTQNKAWFREGFWEKYLKWRKHHRVIA 
GQ FFGHKHTDSFRMLYDDAGVP I SAMF I T PG VTPWKTTLPGWN 
GANWPAlRVFEYDRATI^IJa3MVTYFMNL50ANAQGTPRWBLBY 
QLTBAYGVPDASAHSMHTVLDR I AGDQSTLQRYYVYNSVS YSAG 
VCDEACSMQHVCAMRQVT)IDAYTT(^YASGTTPVPQLPLLLMAL 
LGLCT 


6821 


1088 


518 


K FDI YR/EVGG3FVPVTRDDSSNGFPRTQHGPSPTVri PIQS PQ*J 
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFSLIEGYI \SIVMDAETQKKFPSDLLLTSSSGELWRMVRIG 
GQPLGFDECGIVAQIAGPLAAADISAYYISTFNFDHALVPEDGI 
GSVIEVLQRRQEGLAS 


" 6822 


1088 


518 


EFDI YR/B VGGEF VP VTRDDSSNGFPRTQHGPS PTVHPI QS PQN 
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFSLIEGYI\SIVMDAETQKKFPSDLLLTSSSGELWRMVRIG 
GQPLG FDECG I VAQI AGPLAAADI SAYYISTFNFDHALVPEDGI 
GSVI EVLQRRQEGLAS 


6823




221 


PPKLLSRMARMGHGDBIV\LSDLNFPGLLHLPVVGPWRSVQTAC 
GIPQLLEAVLKLLPLDTYVESPAAVMELVPSDKERGLQTPVWTE 
YES I LRRAGCVRAXiAKIERFE FYERAKKAFAWATGETALYGNL 
ILRKGVLAfcNPLL 


6824 


858 


104 


LLLAQR WGWG \ CCFFSLAVS VKMNVLLFAPGHiFLUCt TQFGFRG ' " 

ALPKLGICAGLQVV1/5LPFLLENPSGYLSRSFDIX3RQFLFHWTV 

NWRFLPEALFLHRAFHLALLTAHLTLLLLFALCRWHRTGESILS 

LLRDPSKRKVPPQPLTPNQIVSTLFTSNFIGICFSRSLHYQFYV 

WYFHTI»pyLLWAMPARWLTHLLRLLVLGLIELSWNTYPSTSCSS 

AALH I CHAVI LLQLWLG PQPFPKSTQHS KKAH 


6825


3 


1173 


SSGEFGLQASDIMWTISDTGWIDI ILCSLMEPKALGACTFVHLL ' 
PKFDPLVI LKTLSS YP I KSMMGAP I VYRMLLQQDLS S YKFPHLQ 
NCLAGGESLLPETLENVn^O/reLDIREFYGQTETGLTCMVSKTM 
KIKPG YMGTAAS CYDVQ1 IDDKGNVLPPGTEGD1GIRVKPIRPI 
G I FSG YVDNPDKTAAN IRGDFWLLGDRG IKDBDGYFQFMGRADD 
I INS SG YRIGPSEVENALMEHPAWETAVISS PDp VRGE WKAF 
VILALQFLSHDPEQLTKELQQHVKSVTAPYKYPRKIEFVLNLPK 
T VTGK I QRA\ KLRDKE WKMS GKAPCAVRHLRD IHLDS PLLS tiS F 
P FGPLALPMDG YGDSLWEEHEYKFCLALVISTKLYHVRC 


6826 


2304 


oca ; 


LKTES F KPW/ VNI ALAFHLLG ERAS PNSFWQP Yl QTLPRE YDTP 
LYFEEDEVRYLQSTQAIHDVFSQYKNTARQYAYFYKVIQTHPHA 
NKL PL KDS FT YED YRVJAVSS VKTRQNO I PTEDGS RVT1AL I PLW 
DMCNH TNG L I TTG YNLEDDR CE CVALQD FRAGEQ I Y I FYG TRSN 
AEFVIHSGFFFD^SHDRVKIIOjGVSKSDRLYAMKAEVLARAGI 

ptssvfalhftepp isaqllaflrvfcmteeelkehllgdsaid 
riftlgnsefpvswdnevklntfledraslllktykttiebdks 
vlknhdlsvrak^iklri^ekeilekavksaavnreyyrq 
ekaplpkyeesnlgllessvgdsrlplvlrnleeeagvqdalni 
reaiskakatenglvngens i pngtrseneslnqeskravedak 

GSSSDSTAGVKE 


6827 


1 


779 


SS WE FGI>S VLGGLFLLFVLENMLGLLRHRGLRPRCCRRKRRNL 
ETRNLDPENGSGMALQPLQAAPEPGAQGCREKNSQHPPALAPPG 
HQGHSHGHQGGTD I T WMVL LGDGIiHNLTDGLA IGAAFSDGFS S G 
LSTTLAVFCHELPHE LGDFAMLLQSGLS FRRLLLLSLVSGALGL 
GGAVLGVGLSLGP VP LTP WVFGVTAGVFLYVALVDML PALF PS S 
GAPAYA\HVLLQGLGLLLGGCLMLAITLLEBRLLPVTTEG 


6828


3 


1654 


XSQHG/WILQLMHSCKEGYVKDLKGNPGLHRAMLDLDNGTRPSE 
LGHLSQTAS LKRGS S FQ SGRDDTWRYKT PHRVAFVBKLTKLVLS 
QLPNFWKliWISYVNGSLFSETAEKSGOIERSKNVRQRQNDFXXM 
IQEVMHSLVKLTRGALLPLSIRDGEAKQYGGWEVKCELSGQWLA 
HAIQTVRIiTHESLTALE IPMDLLQTIQDLILDLRVRCVMATLQH 
TAESIKRIiAEKEDWIVDNEGLTSLPCQFEQCrvCSLQSLKGVLE 
CKPGEAS VFQQPKTQEE VCQLS INI MQVFI YCLEQLSTKPDADI 
DTTHLSVDVSS PDLFGS IHEDFSLTSEQRLL rVLSNCCYLERHT 
FLN IAEHFEKHNFQGI EKITQVSMASLKBLDQRLFENYI ELKAD 
PIVGSLEPGI YAGYFDWKDCLPPTGVRNYLKEALVNI IAVHAEV 
FTISKELVPRVLSKVIEAVSEELSRLMQCVSSFSKNGALQARLE 
I CALRDT VAVYLTPES KSSF KQALEALPQLSSGADKKLLEELLN 
KFKSS MHLQLTCFQAAS STMMKT


6829 


1 


782


MRMEAGEAAP PAGAGGRAAGGWGKWVRLNVGGTVFLTTRQTLCR~~ 

EQKSFLSRLCQGEELQSDRDETGAYLIDRDPTYFGPILNFLRHG 

KLVLDKDMABEGVLBEAEFYNIGPLIRIIKDRM3EKDYTVTQVP 

PKHVYRVLQCQEEBLTQMV3TMSDGWRFEQLVNIGSSYNYGSED 

QAEFIiCWSKELHSTPNGLSSESSRKTKSTEEQLEEQQQQEEEV 

EEVEVEQVQVEADAQEK/ CCYKPEAPGCEAPDHLQGLGVPI 


6830 


1 


939 


MEPGS VKNb s'lVYRS RJ!)Fli WNKHWD VR I DS KAWRETLTLQKQL 

RYRFPELADPDTCYGFRFCHQLDFSTSGLAIXTVALNK^ 

CFKERRVTKAYLALLRGHIQESRVTISHAIGRNSTEGRAHTMCI 

EGSQGCENPKPSLTDLWLEIIGLYAGDPVSKVLIjKPLTGRTHQL 

HVXHCSALGHPWGDIjTYGEVSGREDRPJRMMLHAFYLRIPTDT 

ECVEVCTPDPFLPSLDACWSPHTLLQSLEQLVQAIjRATPDPDPE 

DRGP RPGS PS ALLPG PGR P PPP PTKP PETEAQRGPdiQWLS E WT 

LEPDS 


6831 


3 


1087


SLFFGSSTPDNKVAEQEDLETQPSPSVEKAVTVIDPEGTIPTNF 
WVAE KPADHSLS E VKLKTADEPRGTLVKSGDGQNVKEKSM I LSN 
VEDLQQ PKFI SE VSRED YGKKE I SGDS EEMN INS WTSADGENL 
EIQSYSLIGEKLVMEEAKTIVPPHVTDSKRVQKPAIAPPSKWNI 
SIFKEEPRSDQKQKSLLSFDWDKVPOQPKSASSNFASKNITKE 
SEKPES I ILPVEESKGSL IDFSEDRLKKEMQNPTSLiKISEEETK 
LRSVSPTEKKDNIjENR\SYTLi\ABKKVLAEKQNSV\APLELRDS 
NE IGKTQITLGSRS TEL KES KADAM PQHFYQNEDYNER P KI I VG 


6832


1809 


412 


MGSGLISGPPQDNSGEALKEPERAQEHSLPNFAGGQHFFEYtLV " 
VSLKECKRSEDDYEPIITYQFPKRBNLLRGQQEEEERLLKAIPLF 
CFPDGNEWASLTEYPRETFS FVLTNVDGSRKIGYCRRLLPAGPG 
PRLPKVYCIISCIGCFGLFSKILDEVEKRHQ1SMAVIYPFMQGL 
REAAFPAPGKTVTLKSFI PDSGTKFISLTRPLDSHLBHVDFSSL 
LHCL SFEQILQ I FASAVLBR XII FLAEGLSTLSQCIHAAAALL Y 
PFSWAHTYIPVVPESLIATVCCPTPFMVGVQMRFQQEVMDSPM2 
EVLLVNLCEGTFI*MSVGDEKDILPPKLQDDII»D3LGQGINBLKT 
AEQINEHVSGPFVQFFVKIVGHYASYIKREANGQGHFQERSFCK 
ALTS KTNRRFVKKFVKTQL FS L FIQEAE KS KNP PAG YFOOKILE 
YES Q KKQ / TETKG KNCE I RAWNKND I 


6833 


1 


1129 


PLMTLSQCGGIPGHGHSHGGHGHGHGLPKGPRVKSTRPGSSDIN " 
VAPGEQG PDQEETNTLVANTSNSNGL KLDPADPENPRSGDTVE V 
QVNGNLVREPDHMELEEDRAGQLNMRGVFLHVLGDALG5VIWV 
NALVFYFSWKGCSEGDFCVNPCFPDPCKAFV3IINSTHASVYEA 
G P CWVLYLDPTLCVVM VCILLY TTYPLLKESALILtiQTVPKQI D 
IRNLI KELRNVEGVEEVHELHVWQLAGSRI IATAHI KCEDPTSY 
MEVAKTI KDVFHNHGI HATT IQ PEFASVGS KS SWPCELACRTQ 
CALKQCCXSTIiPQAPSGKDAEKTPAVSISCLELSNNLEKKPRRTK 
AENIPA\WIEIKN\IPNK\QPESSL 


6834 


78 


lisl 


AGQERPAP lWRLLWLPTPSVSRKAEPAHiPINR*GA*E* RGGLP 
LCGSSASAYGWH* RLTPWSPGGS *HM* SSKAPVTQARE VLVAGP 
CS KLVLSG ARG I VGTT VQVLVEAQQP LLLL FTG VWGLNLRAGE E 

SRAL*LIEBVTQVRDAHLGNAWGCAQCLSQGQVGSALAKALLE 
AAAAVRDCKE VLTVSGDKOOAEVS t * VBnvfnnrT77i r» r^nr nv-iri 

AHGRPGLALAKGRGGTNEVEEQVQVDGVQKLVL5AHECHELVAG 
QQDGEDQAARTRLLQAGAHSVAHORRQGQAPCRPHQEAGVSCHE 
LQQWGDAL* ARB* APQI I VLLLLEDVAQLRTGKKA* DLWDVE 
QLLRQZi 


6835 


1 


834 


GIPAADR\EASLELIKLDiSRTFPNLCIFQQGGPYHDMLHSlLG 
AYTCYRPDVGYVQGMSFIAAVLILNLDTADAFIAFSNLLNKPCO 
' ^«- v ^-"iv»iH»jiji. xrA/vtii vrir tcNJjPKJjFAHF KKNNLTPDIYI, 
I n WlFTLYSJCSLPLDIiACRIWDVFCRIX3BFJFI»FRTALGIIiKLFE 
DILTKMDFIHMAQFLTRLPEDLPAEELFASIATIQMQSRNKKWA 
QVLTALQKDS REMREGKS VPPTLRLQRB FALGTNQS PMPRPLCC 
FRIiTPGQPRRTDAL 


6836 


1


850 


mscgrpppdvdgmitlkv^dnltyrtspdslrrvfbkYgrvgdv 

YIPREP,HTKAPRGFAFVRFHDRRDAQDAEAAMDGAELDGRELRV 
QVARYGRRDLPRSRQGRRHAAGPEAA/RYGRRSRS YGRRSRS PR 
RRHRSRSRGPSCSRSRSRSRYRGSRYSRSPYSRSPYSRSRYSRS 
PYSRSRYRBSRYGGSHYSSSGYSNSRYSRYHSSRSHSKSGSSTS 
SRSASTSKSSSARRSKSSSVSRSRSRSRSSSMTRSPPRVSKRKS 
KSRSRSIO^PPKSPEBEGQMSS 


6837 


1 


1369 


tdgaavagnpgsdyfpggtap/ggprtrrp\sg¥sssgskasgp 

PNP PAQGDGTSLS PNYTLES TSGNDGKPVSGGGGRGRGRRKRDS 
GHVS PGTFFDKYSAAPDSGGAPGVS PGQQQAS GAAVGGS SAG BT 
RGAPTPHEKALTSPSWGKGAELLLGDOPDLIGSLDGnAK-^nQQQ 
PNVGEFASDEVSTSYANEDEVSSSSDNPQALVKASRSPLVTGSP 
KLP PRGVGAGEHGPKAP P P ALGLG IMSNSTSTPDS YGGGGGPGH 
PGTPGLEQVRTPTSSSGAPP PDB I HPI>EILQAQ IQLQRQQ FS I S 
EDQPLGLKGGKKGECAVGASGAQNGDS ELGSCCSEAVKSAMS TI 
DLDSIiMAEHSAAWYMPADKALVDSADDDKTLAPWEKAKPQNPNS 
KEAHDLPANKASASQPGSHLQCLSVHCTDDVGDAKARASVPTWR 
SLHSDISNRFGTFVAALT 


6838 


16 


499 


LTDTPPPKTHMIHHSISDYKATLRCWALGFYPMEITLTWQQDEE 
DQTRDMELVETRPAGDGTFQKWAAVWPSGEE/Q/RYMCHVQHE 
GLPEPLTLRWEQSSQPT IPI VG IVAGL VLLGA WTGAWSAVMC 
RKKNSDRVS YS EAAS5DHAQGSDVS LTACKV 


6839 


1 


1195 


AAPAGGGPDPEALSAFPGRHLSGl^ WPQVKRLDALLSEP IPIHG 
RGNF PTLS VQ PRQIRAGG PQHPGGAG \ IHVHR VRLHGS AASHVL 
HPESGLGYKDLDLVPRMDLRSEASFQIiTKAWLACLLDFLPAGV 
SRAKITPLTLKEAYVQKLVKVCTDSDRWSLISLSNKSGKNVELK 
FVDSVRRQFEFS IDS FQI I LDSLLLFGQCSSTPMSEAPHPTVTG 
ESLYGDFTBALEHLRHRVI ATRS PKEI RGGGLLKYCHLLVRGFR 
PRPSTDVRALQRYMCSRFFIDFPDLVEQRRTLBRYLEAHFGGAD 
AARRYACLVTLHRVVNESTVCLMNHERRQTLDIj IAALALQALAE 

QGPAATAAXiAWRP PGTDGV7P ATVNYYVTPVQP LIAHAYPTWL P 
CN 


6840 


4254 


2061 


EIjQGD FS VPD VPKSMAWCENS I CVGFKRDYYL IRVDGKGS I KEL 
FPTGKQ LEPLVAPLADGKVAVGQDDLT VVLNEEt»I CTQKGALNW 
TD I PVAMEHQP P Y I IAVLPRY VB I RT FE PRLLVQS 1 ELQRPR P I 
TSGGSNIIYVASNHFATWRLIPVPMATQIQQLLQDKQPEIiALQlA 
EMKDDSDSEKQQQIHHIKNLYAPNLFCQXRFDESMQVFAKLGTD 
PTHVMGLYPDLLPTDYRKOLOYPNPLPVTiSOAPr.Pvam ar rnv 

LTQKRSQLVKKLNI)SDHQSSTSPLMEGTPTIKSKKKLLQIIJDTT 
LLKCYLHTNVALVAPLLRLENNHCHIEESEHVLKKAHKYSELI I 
LYEKKGLHEKALQVLVDQSKKANSPLKGHBRTVQYLQHLGTENL 
HLIFSYSVWVLRDFPEDGLKIFTEDLPEVESLPRDRVLGFLIEN 
FKGLAIPYLEHIIHVVJEETGSRFHNCLIQLYCEKVQGLMKEYLL 
SFPAGICTPVPAGEEEGELGEYRQKIiLMFLEISSYYDPGRLICDF 
PFDGLLEERALLLGRWGKHEQALFIYVH3LKDTRMAEEYCHKHY 
DRNKDGNKDVYIjSLLRMYLS pps ihclgp iklellepkanlqaa 
LQVLELHHS KI*DTTKAliNLLPANTQINDIR IFLEKVLEENAQKK 
RFNQVLKNLLHAEFLRV\QEERILHQQVKC I ITEEKVCMVCKKK 

ignsafarypngwvhyfcs\kevnpadt 


6841 


1 


3206 


TPS TTGTKSNTPTS S VPS AA^/T PLfclES Lq PLGDYG VGS KNS KRA 
REKRDSRNMEVQVTQEMRNVSIGMGSSDEWSDVQ0IIDSTPELD 

MC PETRLDRTGS s PTQG i vnkafg intdslyhelstagsevi gd 

VDEGADLLGEFSGMGKEVGNLLLENSQLLET KNALNWKNDLI A 
KVDQLSGEQEVLRGELEAAKQAKVKLENRIKEI^EELKRVKSEA 
I IARRE PKE EAEDVSS YLCTES DK I PMAQRRR FTR VEMARVLME 
RNQYKERLMELQEAVRWTEMIRASREHPSVQEKKKSTIWQFFSR 
LFSSSSSPPPAKRPYPSGNIHYKSPTTAGFSQRRNHAMCPISAG 
SRPLBFFPDDDCTSSARREQKREQYRQVREHVRNDDGRLQACGW 
SLPAKYKQLSPMGGQEDTRMKNVPVPVYCRPLVEKDPTM1CLWCA 
AGVNLSGWRPNEDDAGNGVKPAPGRDPLTCDREGDGEPKSAHTS 
PEKKKAKELPEMDATSSRVWILTSTLTTSKWI IDANQPGTWD 
QFTVCNAHVLC I S S 1 PAASDSDYPPGEMFLDSDVN PEDPGADG V 

LAG I TXiVGCATR fTMVPP o hp c ep inTrnn- -n vr*rsr* ct n^-n ▼ . 
»**»ww»»ftwn vjrxwsMuCroJ&uMJ I IrVliUKGQGEVATIANGKV 

NPSQSTEEATEATBVPDPGPSEPETATLRPGPLTEHVFTDPAPT 

PSSGPQPGSENGPEPnssSrRPEPEPSGDPTGAGSSAAPTMWLG 

AQNGmYVHSAVANWKKCLHSIKLKDSVLSLVHVK^RVLVALAD 

GTLAIFHRGEDGQWDLSNYHLMDLGHPHHSIRCMAWYDRVWCG 

YKNKVKVIQPKTMQ IEKS PDAHPRRESQ VRQLAWI GDGVWVS IR 

LDSTLRLYHAHTHQHLQDVDIEPYVSKMLGTGKLGFSFVRITAL 
LVAGSRLWVGTGNG WISI PLTBTWLHRC30\LLG \ LRANJfT*; P 
TSGEG\ARPGG\IIHVYG\DDSSDRAARSFIPYCSMAQAQLCFH 
GHRDAVKFFVS VPGN VLATLNGS VLDS P AEGPGPAAPAS EVEGQ 
KLRNVLVLSGGEGYI DFR IGDGEDDETEEGAGDMSQVKP VL S KA 
ERSHI I VWQVS YTPE 


6842 


3 


926 


KCOgLS ATI LTDHQYJjERTPLCAIIi KQKAPQQ YR I RAKLRS YKP 
RRLFQSVKIiHCPKCHLLQEVPHEGDLDriFQDGATKTPDVKLQN 
TCLYDSKIWTTKNQKGPJKVAVHFVKNNGILPLSNECLIiLIEGGT 
LSEICKLSNKFNSVIPVRSGHEDtiELLDLSAPFUQGTVHHYGC 
KQWST'RS I QNLNS IiVDKTS WI P SS VAEALGI VPLQ YVFVMTFT 
LDDGTGVLEAYLMDSDKFFQ I PASEVLMDDDLQKSVDMIMDMFC 
P PGIKI DAYPWLE C FIKS YNVTNGTDNQ I CYQ I FDTT VAEDVT 


6843 


2 


851


NHRKVLSGAKKYECNECGKSFAYTSSLIKHRRIHTGERPYECSE 
CGRS FAENSSLI KHLRVHTGERP YECVE CGKS FRRS S SLLQHQR 

VHTRERPYECSBCGKSFSLRSNLIHHQRVHTGERHECGQCGKSF 
SRKSShl IHIiRVHTGERPYECSDCGKSFAENSSLlKHLR VHTGE 
RP YECIDCGKS FRHS S S FRRHQRVHTGMRPYK*S KFWKFS CPG F 
LLLQGQR VHTGSRCYECDKWG 1 FFS*NAS FFT* KS APTEEVP FE 
CNECEKA FS PLSLVTTI FT 


6844 


i44 - " 


642 


EHQLAGFELRKTQTSMSLGTTREKTDRVKSTAYI.S PQELEDVFY 
QYDVKSEIYSFGIVLWEIATGDIPFQGQTSEKIRKLVAVKRQQE 
plgedcpski^eiidbcrahdpsvrpsvdeilkklstfsk*cik"' 
I 


6845 


3 


1513 


VAVR DB CYWRH VFWDQDLWMLLFI fiMc&P BTARARLB YRI RTliD 
GALENAQNLG YQGAKPAWESADSGLEVCP BD I YGVQB VHVNGAV 
GIAFBLYYHTTQDLQLFREAGGWDWRAVAEFWCSRVEWSPREE 
KYHLRGVMS PDEYHSGVNNSVYTNVEjVQNSIiRFAAALAQDLGLP 
I PS QWLAVAD K IKVP FDVEQNFHP EFDGYE PGEWKQADWLLG 
YP VP FSLS PD VRRKNLE I YEAVTS PQGPAMTWS MFAVGWMSLKD 
AVRARGI^DRSF7^MAEPFKVWTBNATX5SGAVNFLTGMGGFLQA 
WFGCTGFRVTRAGVTFDPVCLSGISRVSVSGIFYQGNKIjNFSF 
S EDS VTVB VTARAG P WAPHLEAELWPSQSRLS LLPGHKVS FPRS 
AGRIQMSPPKLPGSSSSBFPGRTFSDVRDPLQSPLWVTLGSSSP 
TESLTVDPASE*-SGTGASETSLGPSLWPRLHPPLLGTLIiACHPS 
PAARI^GKVHAAWPEFKAFCL 


6846 


213 


1258 


LYFLKTIK*LNKLAEHP*YENEKLTKLRNTIMEQYTRTEESARG~ 
1 1 FTKTRQSAYALSQW I TENEK FAEVG VKAHHL I GAGHS SEFKP 
MTQNEQKEV I S KFRTG K I NLL2ATTVAEEGLD I K ECN 1VIRYGL 
VTNEI AMVQARGRARAD ESTYVLVAHSGSGV I EHETVNDFREKM 
MYXAIHCVQNMKPEE YAHJCILBLQMQS IMEKKMKTKRNIAKHYK 
NNPSLlTFLCKNCSVLACSGEDTIIVTRKMmfUTaMTO-RTrTrwT vtv 
RENKTLQKKCADYQINGEI I CKCGQAWGTMMVHKGLDL PCLKIR 
KFVWFXNNS TKKQYKKWVELP ITFPNIDYS E CCLFS DED 


6847 


1450 


348 


SMCMNSDRLEMPLIDLALILYPPSYVPYTGHLSDDSLSRKYCLT 
WFEDALNGVL* RAEA I Q PHC VNAGDRMEKFRQKYWNKLQTLRQQ 
PFAYGTLTVR S LLDTREHCLNEENF PDPYS KVKQRENGVALRCF 
PGVVR5LDALGWEERQLALVXGLLAGNVFDKGAKAVSAVLESDP 
YFGFEEAKRKLQERPWLVDSYSEWLQRLKGPPHKCAL1FADNSG 
ID! I LGVFPFVREtLLRGTEVtliACNSGPALNDVTHSESLl VAE 
RIAGMDPWHSALREERLLLVQTGSS5PCLDLSRLDKGLAALVR 
ERGAJDLWI EGMGRAVHTNYHAAIiR C ES L KLAVI KNAWLAERLG 
GRLFSVIFKYEVPAE 


6848 


19 


16 


AMWWNSLDGIRNIVLSNPKKRNTLSLAMLKSLQSDILHDADSND " 
LKVI I ISAEGPVFSSGHDLKELTEEQGRDYHAEVFQTCSKVMMH 
IRNHPVP V I AM VNGLATAAGCQLVAS CD I AVASDKSS FAT PGVN 
VGLFCSTPGVAIiARAVPRKVALEMLFTGEPISAQEAtiLHGLZiNK 
WPEAELOEETMRrARKIASLSRPWSLGKATFYKQLPQDLGTA 
YYLTS Q AMVDWLALRDGQEGI TAFLQKR KPVWSHEP V* VEH 


6849 
6850


70 


021 


SLC^GSCLEQGSPAPRPQTDTSP*PVGNWATQQEDLYHQSYEC 
VCVLFASVPDFKEFYSBSNINHEGLECLRLLNBIIADFDELLSK 
PKFSGVEKIKTIGSTYMAATGLNATSGQDAQQDAERSCSHLGTM 
VE FAVALGSKLDVI NKHS FNNFRLRVG LNHGP WAGV1 GAQKPQ 
YD I WGNTVNVASRME STGVLGKIQVTEETAWALQS LG YTCYSRG 
VIKVKGKGQLCTYFLNTDLTRTGPPSATLG 




2 


1235 


ARGLNHEWT FB KIiRQHI SRNAQDKQE LHLFMIjSGVPDAVFDLTD 
LDVt»KLEI*IPEAKI PAKISQMTNLQELHLCHCPAKVEQTAFS FL 
RDHLRCLHVKFTDVAEIPAWVYLLKNLRELYIilGNLNSENNKMI 
GLESLRELRHLKILHVKSNLTKVPSNITDVAPHLTKLVIHNDGT 
KLLVLHSLKKWMNVAELELQNCELBR I PHAIFSLSNLQELDLKS 
NNIRTIEEI ISFQHLKRLTCLKLWHNKI VTIPPS ITHVKNLESL 
YFSKNKLESLPVAVFSLQKLRCLDVSYNNISMIPIEIGLLQNLQ 
HLH ITGNKVD IL P KQLFKC I KLRTLNLGQMCI TS LPEKVGQLS Q 
LTQLELKGNCLDRLPAQLGQCRMLKKSGLVVEDHLFDTLPLEVK 
EALNQDINIPFANGI 


6851 


1765 


660 


VSAQVSAREGENCLGWNLADSSQESYKSLEEAEDCYPPSLLTU> 
LRDLFNQVEQGPLLSCPKAGTDLSMGRARBVGWMAAGLMIGAGA 
CYCVYKLTIGRDDSEKLEEEGBEEWDDDQELDEEEPDIWFDFBT 
MARPWTEDGDHTEPGAPGGTEDRPSGGGKANRAHPIKQRPFPYE
HKNTWSAQNCKNGSCVLMjSKCLFIQGKLLFAEPKDAGFPFSQD 
INS H LASLS WARNTS PTPDPT VREAL CAPDNLNAS I ESQGQIKM 
YINEVCRETVSRCCNS FIiQQAGLNLLISMTVINNMLAKS ASDLK 
FPLISEGSGCAKVQVLKPLMGLSEKPVLAGELVGAQMLFSFMSL 
FIRNGNRE J LLETPAP 


6852 


1 


407


RTRGEETYANFIKHNDGKNIFYAARTPATLFAVMFAMYIISGLT " 
GFI GLNSIAVLCNLVMGLALI FLCTWAYVKYSGEFREIGTVIDQ 
IAETLWEQVIiKPLGDNIjMEENIRQS VTNS IKAGLTDQVS HHARL 
KTD 


6853 


3 


469 


GDSCAVCIELYKPNDLVRlLa'C^lF^^CVDPWIiEHR'TCPMC 
KCD I LKALGIEVDVEDGSVSLQVPVSNE IFNSASSHEBDNRSET 
ASSGYASVQGTYEPPLEEHVQSTNESLQLVNHEANSVAVDVIPH 
VDNP TFE ED ETPNQETAVRE I KS 


6854 


1148


585 


HES Y IGTFD PGELCVCAAIQWLQDNS AS YFLNRKL V YE P STQ AK " 
P VKNTFLRMW I Y5HHI YQQDIiRKKI LDVG KR I»D VTGFCMTGKPG 
I ICVEX3FKEHCEBFWHTIRYPNWKHISCKHAESVETEGNG2DLR 
LFHSFEEIJiLEAHGDYGLRNDYHMNLGQFLEFLKKHKSEHVFQI 
LFGIESKSSDS ~ ! 


6855


1913 


1148 


GRVGGRVGRICSPI^GANEYIASTDTLKTEEVLLFTDQTDDLAK""' 
EEPTSIiFQRDSETKGESGLVLEGDKEIHQlFEDLDKKLALASRF 
YIPEGCIQRWAAEMVVAtPAiaREGIVCRDIiNPNNILLNDRGHI 
QLTYFSRWSE VEDSCDSDAI ERMYCAPEVGA I TEETE ACDWWSL 
GAVLFELLTG KTLVECH PAG I NTHTTLNMPEWVSE E ARSLI QQL 
LQFNPLERLGAGVAGVEDIKSHPFFTPVDWAELMR 


6856


1617 


997


VTQLYVSVDASTKDSLKKIDRPLFKDFWQQFLDSLKALAVKQQR 
TVYRLTLVKAWNVDELQAYAQLVSLGNPDFIEVKGVTYCGESSA 
SSLTMAHVPWHEEWQFVRELVDLI PEYEIACEHEHSNCLLIAH 
RKFK I GGE WWTWI NYNRFQEL IQB YEDSGGS KTFS AKDYMARTP 
HWAIiFGAS ERGFDPKDTRHQRKNKS KAI SGC 


6857 


1 


617 


KGPEATAMVCVCSHPNC^QNHiKpStiSAAQTttCGSPTPASAPNH 
KLMAMEQGKTLPSATEDAKEEGLEAQISRLAEL IGRLESKALWF 
DLQQRLSDEDGTNMHIjQLVRQEMAVCPEQLSEFLDSLRQYLRGT 
TGVRNCFHI TAVRLSIX3FTFVI YEFWETEEAWKRHLQSPLCKAF 
RHVKVDTLSQPEALSRILVPAAWCTVGRD 


6858 


2 


669 


RSRGIKDFENDPPLSSCGIFQSRIAGDALLDSGIRISSVFASPA ~ 
LRCVQTAKLILEELKLEKKIKIRVEPGIFEWTKWEAGKTTPTLM 
SLEELKEANFNIDTDYRPAFPLSALMPAES YQE YMDRCTAS MVQ 
1VNTCPQDTGVILIVSHGSTLDSCTRPLLGLPPRECX3DFAQLVR 
K I PS LGMCFCE BNKEEGKWEIiVNPPVKTLTHGANAAPN WRNW I S 
GN 


6859


1 


1150 


GETMFKKAKTKAKKKPRKRSDSSGGYNLSDI IQSPSSTGLLKSG '"" 
KTNS VBSLPELLTSDSEGS YAGVGS PRDLQS PDFTTGFHSD KIE 
AKVKP YVNGTS P VYS REDLK P WEKS P I LKI SAPQP I P SNR I DTT 
SSASWVAGSFS P VSPPWDLRTIMB I EESRQKCGATPKSHLGKT 
VSHGVKLSQKQRKMIALTTKEmSGMNSMBrVLFTPSKAPKPVN 
AW AS S LHS VS S KS FRD FLLE EKKS VTSHSSGDHVKKVS FKG I EN 
SQAP KI VRCS THGTPGPEGNHIS DLPLLDS PNPWLSSS VTAPSM 
VAP VTFAS I VEEE LQQEAAL IRSRE KPLALIQI BEHAI QDLLVF 
YEAFGNPEEFVIVBRTPQGPLAVPMWNKHGC 


6860 


1889 


1515


DKDKKRQKKRG I F PKVATN IMRAWLFQHLTHP YPS EEQKKQLAQ " " 

DTGLTILQVNNWFINARRIIVQPMIDQSNRAVSQGAAYSPEGQP 

MGSFVLDGQQHMGIRPAGPMSGMGMNMGMDGQWHYM 


6861 


1889 


1515 


DKDKKRQKKRGIFPKVATNIMRAWLFQHLTHPYPSEEQKKQLAQ
DTGLTILQVNNWFINARRIIVQPMIDQSNRAVSQGAAYSPEGQP
MGSFVLDGQQHMGIRPAGPMSGMGMNMGMDGQWHYM 


6862 


2 


471 


EEIDREFHNKLKLKEDKLEKQBKPVNGEDKGDSGVDTQNSEGNA ' 
DE EDPLG PNCYYDKTKS FFDN I S CDDNRERRPTWAEERRLNABT 
FG I PLR PNRGRGGYRGRGGLG FRGGRGRGGGRGGTPTAPRGFRG 
GFRGGRGGRE FAD FE YRXTTAFG P 


6863


2216 


487 


PQEPALKSEFSQVASNTIPIiPLPQPNTCKDNGPCKQVCSTVGGS 
AICSCFPGYA1MADGVSCEDQDECLMGAHDCSRRQFCVNTLGSF 
Y C VNHTVLCADG Y I LNAHRKCVDI MECVTDLHTCSRGEHCVNTL 
GSFHCYKALTCEPGYALKDGECEDVDBCAMGTHTC2QPGFLCQNT 
KGSFYCQARQRCMDGFLQDPEGNCVDINECTS1»SEPCRPGFSCI 
NTVGS YTOQRNPLI CARGYHASDDGTKCVDVNECETGVHRCGEG 
Q VCHNL PGS YRCDCKAG FQRDAFGRGCIDVNECWAS PGRIiCQHT 
CENTLGSYRCSCASGFLLAADGKRCEPVNECEAQRCSQECANIY 
GSYQCYCRQGYQLAEDGHTCTDIDECAQGAGILCTFRCLNVPGS 
YQCACPECK5YTMTANGRSCKDVDECALGTHNCSEAETC3JNIQGS 
FRCLRFECP PNYVQ VS KTKCERTTCHDFLECQNS PAR I THYQLN 
FQTGLLVPAH I FRI G PAPAFTGDT I A1NI I KGNE EG Y FGTRRLN 
AYTGWYLQRAVLE PRD FALDVEMKLWRQGS VTTFLAKMHI FFT 
TFAL 


6864


2 


2933 


LADSS PSNLQ 1 1 IKELLSMHHQPDPALTKEFDYLPP VDSRS SSG 
FVGLRNGGATCYMNAVFQQLYMQPGLPESLLSVDDDTDNPDDSV 
FYQVQSLFGHIJflESKLQYWPENFWKIFKMWNKELYVREQQDAY 
EFFTSLIDQMDEYLKKMGRDQIFKNTFQGIYSDQKICKDCPHRY 
EREEAFMALNLGVTSCQSLE I SLDQ FVRGEVLEGSNAY YCE KCK 
EKR I TVJOtTC I KS LPSVLVT HLMRFG FD W ESGRS I KYDEQ I RFP 
WMLNMEPYTVSGMARQDSSSEVGKNGRSVDQGGGGSPRKKVALT 
ENYELVGVIVHSGQAHAGHYYSFIKDRRGCGKGKWYKFNDTVIB 
EFDLNDETLE YECFGGEYR P JCVYDQUTPYTDVRRRYWNAYM LFY 
QRVSDQNSP VLPKKSRVSWRQEAEDLSLSAPSS PBI 3PQSSPR 
PHRPNNDRLSILTKLVKKGEKKGLPVEKMPARIYQMVRDENLKF 
MKNRDVYSSDYFSFVLSLASLNATKLKHPYYPCMAKVSI^JliAIQ 
FLFQTYLRTKKKLRVDTBEWIATIEALLSKSFDACQWLVEYFIS 
SEGRELI KI FIXE CNVRBVRVAVATILB KTLDSALF YQDKLKS L 
HQLLEVLLALLDKDVPENCKNCAQYFFLFNTFVQKQG I RAGDLL 
LRHS A1»RHM I S FLLG AS RQNNQ I RRWS S AQARE FGNLHNTVALL 
VLHSDVS SQRNVAPG I FKQRPP I S IAPS SPLLP&HEE VEALLFM 
SEGKPYLLEVMFALRELTGSLLALIEMWYCCFCNEHFSFTMLH 
FIKNQLBTAPPHELKNTFQLLHSILVIEDPIQVERVKFVFETBN 
GLLAIiMHHSNHVDSSRCYQCVKFLVTLAQKCPAAKEYFKENSHH 
WSWAVQWriQKKMSEHYWTLQSNVSNETSTGKTFQRTISAQDTLA 
YATALLNEKEQSGSSNGSESS PANENGDKHLQQGS E S PMMI GE L 
RSDLDDVDP 


6865 


1820 


1242 


DPERWKHLSKUTPPGSSVSTTPVQWRLQSPQSQGSMMPSCNRS 
CSCSRGPSVEDGKWYGVRSYLHLFYEGYAVPPKLEGIGEGEFLV 
LDQRAAD YNQALGTCRLAGTALCVAAGVLLAICLFWAM IGWLS Q 
DTKAEPLDPEADSHVEVFGDEPEQQLSPIFRNASGQSWFSPPAS 
PFGQSSVQTIQPKRDS 


6866 


1571 


495 


DCPRPRYTLYGLRATCMRDLDWAWINAVSAFXALEQDLPVNIKF 
IIEGMEEAGSVALEELVEKEKDRFFSGVDYrVISDNLWISQRKP 
AIT YGTRGNS YFMVE VKCRDQDFHSGTFGG ILHE PMADLVALLG 
SLVDSSGHI LVPGI YDEWPLTEEEINTYKAIHLDLEE YRNSSR 
VEKFLFDTKEEILMHLWRYPSLS IHGIEGAFDEPGTKTVIPGRV 
IGKFS IRLVPHMNVSAVEKQVTRHLEDVFSKRNSSNKMVVSMTL 
GIoHPWIANIDDTQYLAAKRAIRTVFGTEPDMIRDGSTIPIAKMF 
QE I VHKS WLI PLGAVDDGEHSQNEKINRWNYI EGTKLFAAFFL 
EMAQLH 
6867


2833 


1704 


GTRIMSQPKQKELAQPVRQKMLLDYSVYMGRCVPQESRSPORSP " 
LQSABSSPTAGKKLPBVPPSEEEEQEAWVNALLGRIFWDFLGEK 
YWSDLVSKKIQMKLSKIKLPYFMNELTLTBLDMGVAVPKILQAP 
KP YVDHQGLW I DLEMSYNGS PLMTLETKMNLTKLGKEPLVEALK 
VG E I G KEGCRPRAFCLADS D EESS S AGS S E ED DAPEPSGGDKQL 
LPGAEGYVGGHRTSKIMRFVDKITKSKYFQKATETEFIKXKIEEx 
VS NTPLLLTVEVQ BCRGTLAVNI p P PPTDRVW YGFRKP PHVELK 

ARPKLGEREVTLVHVTDWIEKKLBQEFQKVFVMPNMDDVYITIM 
HSAMDPRSTS CLLKDPP VBAADQ p 


6868 


1 


346 


RPTRPPTR PEE I KNL ILP Y I SDMNF VQDLCED F YELFKTDKGFD 
KATFESQMSVMRGQILNLTQALRDGKSPFQLVQIPCVIVERSQG 
GSQGRI VHLSNS FTQTVNCRKPFPSSW 


6869 


3 


1619 


MYMERMDKRALISFWESVEHLKNANKNEIPQIjVGEIYQNFFVES ' 
KE IS VE KSLYKE I OOCLVGNKG T R VVYK T rtPnWFT r vnr> wn =? 

FIVSDLYEKLLI KEEEKHASQMISNKDEMGPRDEAGBEAVDDGT 
MQINEQASFAVNKLRELNEKLEYKRQALNSIQNAPKPDKKIVSK 
LKDEIILIEKERTDLQLHMARTDMWCENLGMWKASITSGEVTEE 
NGEQLPC YFVMVSLQE VGGVETKNWTVPKPXS E FHNLHRXLSEC 
VPSLKKDQLPS LSKLP PKS I DHTFMEKFENQLNKFLQNIiLS DER 
LCQS E AL YAFL S PS PDYLKVI DVQG KKNSFSLS S FLERIiPRDFP 
SHQEE ETEEDS DliSD YGDDVDGRKDALAEPCFMLIGE I FE LRGM . 
FKWVRRTlil ALVQVTFGRT I N KQ I RJDT VS WI FS EQMLVY YI NI F 
RDAFWPNGKLAPPTTIRSKEQSQETKQRAQQKLLENIPDMLQSIj 
VGQQNARHGIIKIFNALQETRANKHLLYALMELLLIELCPELRV 
HLDQLKAGQV 


6870 


1 


1566 


MAAVVAATRW WOLLLVLS AAGMGA ^fi APnb put tY't t MnnMPup — 
DLGVYGEPSRETPNLDRMAAEGLLPPNFYSANPLCSPSRAAIjLT 
GRLPIRNGFYTTNAHARNAYTPQEIVGGIPDSEQIiLPELLKICAG 
YVSKIVGKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARP 
NI PVYRDWEMVGR YYEEFP INI>KTG EANLTQI YLQEALD FI KRQ 
ARHHP FFL YWAVDATHA P VYASKP FLGTS QRGR YGDAVR E I DDS 
I GKILELLQDLHVADNTFVFFTSDNGAALIS APEQGGSNG P FLC 
GKQTTFEGGMREPALAWWPGHVTAGQVSHQLGS IMDLFTTSLAL 
AGLTPPSDRAIDGLNLLPTLLQGRLMDRPI FYYRGDTIiMAATLG 
QHKAHFWTWTNSWENFROGIDFCPGQNVSGVTTHNLEDHTKLPL 
IFHLGRDPGERFPLSFASAEYOEALSRITSWQQHQEALVPAQP 
QLNVCNMAVMNWAPPGCEKLG KCLTPPES I PKKCLWSH 


6871 


209 


1126 


RMS JjNP P I FLKRSEENS SKF VETKQSQTTS IAS ED PLQNLCLAS 
QEVLQKAQQSGRSKCLKCX5GSRMFYCYTCYVPVENVPIEQIPLV 
KLPLKIO 1 1 KHPNETDGKSTAI HAKLLAPE FVNI YT YPCI PE YE 
EKDHEVALI FPGPQSI S IKDI S FHLQKR IQNNVRGKNDDPDKPS 
FKRKRTEEQBFCDLNDSKCKGTTLKKI I FIDSTWNQTNKIFTDE 
RLQGLLQVELKTRKTC FWRHQKGKPDTFLST I EAI Y YFLVD YHT 
DILKEKYRG QYDMLLFFYS FMYQLI KNAKCSGDKETGKLTH 


6872 


880 


459 


FGLLMVVLS LIFMKGNCVREDLI FNFLFKLGIiDVRETNGLFGNT 
JCKLITEVFVRQKYLEYRRI P YTEPAE YEFLWGPRAFIiETSKMLV 

LRFLAKLHKKDPQSWPFHYLEALAECEWBDTDEDEPDTGDSAHG 
PTSRPPPR 


6873 


1929 


955 


DEQAVIiCSKDKTYDIiKIADTSNMLliFIPGCKTPDQLKKEDSHCrf 
IIHTEIFGFSNNYWELRRRRPKLKKLKKLIjMENPYEGPDSQKEK 
DSNSSKYTTEDLLDQIQASEEEIMTQLQVLNACKIGGYWRILEF 
DYEMKlXNHVTQLVDSBSWSFGKVPLNTCLQELGPIiEPEEMIEH 
CLKCYGKK YVDEGEVYFELDADKI CRAAARMLLQNAVKFNLAEF 
QEWCXJS^EGMVTSLDQLKGLALVDRHSRPEIIFLLKVDDLPE 
DNQERFNSLFSLREKWTEEDXAPYIQDLCXSBKQTIGALIiTKYSH 
SSMQNGVKVYNSRRPIS 
6874


X 


307 


DSIADHVNSAAVNVEEGTKNIX3KAAKYKLAALPVAGALIGGMVG 
GPIGLLAGFKVAGIAAALGGGVIXSPTGGKLIQRKKQKMMEKLTS 
SCPDLPSQTDKKCS 


6875 


1688 


349 


Vl GTGERGNS AS B KWE I MFNEELG DP FI 1 1 HS I S LLNAEEHSIA 
TLLLRIEKEELDMKGSGFyVSIiEWVTISKKNQDNKKYEIIKRDI 
LRGKSVPHYAAIEPEGNGLMIVSYKSLTFVQAGQDLEENMDEDI 
SBKIKEPLYYWQQTEDDLTVTIRLPEDNTKEDIQIQFLPDHINI 
VLKDHQ FL EGKL YSS IDHE SST W 1 1 KESNSLE I S LI KKNEGLTW 
PELVIGDKQGELIRDSAQCAAIAERLMHLTSEELNPNPDKEKPP 
CNAQELEECDI FFEESSSLCRFDGMTiKTTOVVNLGSNQYIiFSV 
IVDPKEMPCFCLRHDVDALLWQPHSSKQDDMWEHIATFNALGYV 
QASKRDKKFFACAPNYSYAALCECLRRVFIYRQPAPMSTVLYNR 
KEGRQVGQVAKOOVAS LETNDPI LGFOATNRPT. PUr .TTTfWT .vrr r 
KVWTEN 


6876 


41 


1285 


VGEMTLIWRHLLRPLCLVTSAPRILEMHPFLSLGTSRTSVTKLS 
LHTKPRMPPCDFMPERYQVIFLVNSGSEANEIAMLMARAHSNNI 
DI IS FRGAYHGCSPYTLGLTNVG I YKMELPGGTGCQPTMCPDVF 
RGPWGGSHCRDS PVQTIRKCSCAPDCCQAKDQY IEQFKDTLSTS 
VAKS IAGFFAE PIOGVNG WOYPKGFLXEAPm .VP&Pfyivr»T aw 
E VQTGFGRLGSHFWG FQTHDVLPDI VTMAKG IGNGFPMAAVT TT 
PEIAKSIAKCLQHFNTFGGNPMACAIGSAVLEVIKEENLQBNSQ 
EVGT YMLLKFAKLRDB FE I VGD VRG KGLMIGI EMVQDKIS CR PI> 
PREEVNQIHEDCKHMGLLVGRGSIFSQTFRIAPSMCITKPEVDF 
AVEVFRSALTQHMERRAK 


6877 


1 


778 


GTS PS PARAYAPPTERKRFYQNVS I TQGEGGFE INtiDHRKLKTP 
QAKLFTVPSEALAIAVATEWDSQQDT I KYYTMHLTTLCNTSLDN 
PTQRNKDQLIRAAVKFLDTDTICYRVEEPETLVELQRNEWDPI I 
EWAE KRYGVE I S S 5 TS IMG PS I PAKTRE VLVSHLAS YNTWALQG 
IEFVAAQLKSMVLTLGLIDLRLTVEQAVLLSRIiEEEYQIQKWGN 
IEWAHDYELQELPJVRTAAGTLFIHLCSESTTVKHKLLKE 


6878 


931 


263 


qti^gdfknraemidfwiriknvtRsdaSkyrcevsapseqgqn 
leedtvti^/l vapavps cevpss alsgtvvelrcqdkbgn pap 
eytwfkdgirllenprlgsqstnssytmntktgtlqfntvskld 
tge ys cearns vgyrrcp gkrmqvddlni sg i iaawwalvis 
vcglg vc yaqr kg yfs kets fqksns s skattms endfkhtksf 

II 


6879 


3 

i 


845 


IRVIGESDIMOEFLSESDEWYNGVSDVBLRVALPDGTTVTVRVK " 
KNSTTDQVYQAIAAKVGMDSTTVNYFALFEVI SHSFVRKLAPNE 
PPHKLY IQNYTSAVPGTCLTIRKWLFTTEEE I LLNDNDLAVTYF 
FHQAVDD VKKGY I KAE BKS YQLQKL YEQRKMVM YLNMLRTCEGY 
NE 1 1 F PUCACDSRRKGHVI TAJ S ITHFKLHACTEEGQLENQVIA 
FEWDEMQRWDTDEEGMAFCFE YARGEKKPRWVKI FTPYFNYMHE 
CFERVFCEL KWRKEEY 


6880 


2110 


1437 


RKDNCTAKEW TFP EAKWNTTARVFS HI RLGMGHVLI I VQ CF I SS 
MANI YNEKILKEGNQLTES I FIQNSKLYFFGILFNGLTLGLQRS 
NRDQ I KNCGF FYGHRAFS VAL I FVTAFC3GLSVAFI LKFLDNMFH 
VLMAQVTTVI ITTVS VLVFDFRPSLEFFLEAPSVLI/SIFI YNAS 
KPQ VPE YAPRQER I RDLSGNLWERS S GDGBELERLTKPKSDESD 
EDTF 


6881 


2638 


2244 


NDSKWEDI HVITGALKMFFRELPEPLFTFNHFNDFVNAI KQEPR 
QRVAAVKDL IRQIi P KPNQDTMQI LFRHLRRVI ENGEKNRMT YQS 
IAIVFGPTLLKPEKETGNIAVHTVYQNQIVELILLELSSIFGR 


6882


1 


850 


GIPEAQLWIYPVKSCKGVPVSEAECTAMGLRSGNLRDRFWIjVIN 
QBG^VTARQBPRLVLISLTCDGDTLTLSAAYTKDLLLPIKTPT 
TNAVHKCRVHGLEIEGRDCGEATAQWITSFLKSQPYRLVHFEPH 
MRPRRPHQIADLFRPKDQIAYSDTSPFLILSEASLADIiNSRLEX 
KVKATNPRPNIVISGCDVYAEDSWDBLLIGDVKLKRVMACSRCI
LTTVDPDTGVMSRKBPLETLKSYRQCDPSBRKLYGKS PLFGQYF 
VLENPGTIKVGDPVYLLGQ 


6883 


2794 


2256 


NSKLKLNQNLKLFITLTYQVLSLHGWGPGIHLQKBGAFPVTQNR 
ALQLLYDLRYLNIVLTAKGDBVKSGRSKPDSRIBKVTDHLEALI 
DPFDLDVFTPHLNSNLHRLVQRTSVLFGLVTGTENQLAPRSSTF 
NSQEPHNILPLASSQIRFGIjLPLSMTSTRKAKSTRNIETKAQYD 
ANC 


6884 


2 


S3 


BFBRVTAEAVKPRETSEPRAAAQRFCEKFPFJL 


6885 


297 


1554 


STGQFWHVTDLHtDPTYHlTDDHTKVCASSKGANASNPGPFGDV 
LCDSPYQLILSAFDFIKNSGQEASFMIWTGDSPPHVPVPELSTD 
TVINVITNMTTTIQSI*FPNLQVFPALGNHDYWPQDQLSWTSKV 
YNAVANL WKP WLD E EA I S TL RKG G FYSQ ECVTTNPMLR IIS LNTN 
LYYGPNIMTLNKTDPANQ FEWLBSTLNNSQQNKEKVYI I AHVPV 
GYLPSSQNITAMREYYNEKLIDI FQ KYSDVIAGQPYGHTHRDS I 
MVL S DKKGSPVNS LFVAPAVTPVKS VLEKQTNNPG IRLFQYDPR 
DYKLLDMLQ YYLN LTE ANLKGES I WKLE Y ILTQTYD I ED LQPES 
LYGLAKQFTILDS KQFI KY YNYFFVSYDSSVTCDKTCKAFQI CA 
J.PUNL>DNI£> XAUCIjKQIjYI KHNY 


6886


2 


1341 


QCGGI PGREGGSS RPLEEGTGSS PACVRGAAPGSEDAFY PTRAK 

QARVSQE LKKAAKRTVS I S EGPDTLGDGMRERRETLALAP BP E P 

LEKEACEKWKRPFRSASATSLTLSHCVDWKGLLDFKKRRGHS I 

GGAPEQRYQIIPVCVAARLPTRAQDVLDAHLSEVNAVRFGPNSS 

LLATGGADRLIHLWNV VGSRLEANQTLEGAGG3 ITS VDFDPSGY 

QVIiAAT YNQAAQLWKVGEAQS KETLS GHKDKVTAAKFKLTRHQA 

VTGSRDRTVKE WDLGRAYCS RTINVLS YCNDWCGDHI I ISGHN 

DCiKlRTJ'WnQPrtDUf'TriX/TtivnrsDTTTCT.CT.QuriOT tjt t croTirwmi 
W«*»»r nwoftw rnviyvirv \j_ WK V X o l>c> Li iJrLU yj hri L>L> o LoKDNT 

LKVIDLRVSNIRQVFRADGFKCGSDWTKAVFSPDRSYALAGSCD 
GALYIWDVDTGKLESRIiQGPHCAAVNAVAWCYSGSHMVSVDQGR 
KWLWQ 


6887 


1047 


116 


WTARPS QK PFW EAGAVPGDPLS TGC3 QAQLGG CC PRGP WGPQHG 
GCX1RAAGPTLPRGERGGPQQSGPGLAAQTPPTSKQVAWRAFLTG 
TYRSQS PRSPAGP FRGGTGWW PEPAVCLCVAVGPQRLS SPGLVY 
NASGSEHCYDIYRLYHSCADPTGCGTGPDARAWDYQACTEINLT 
FAS NNVTDMFPDLP FTDELROR YCLDTWGVWP'R ptiwt . T .T<5 vwrzrz 

DLRAASNIIFSNGNLDPWAGGGIRRNLSASVIAVTIQGGAHHLD 
LRAS H P E DP AS WE AR KLE AT 1 1 G EW VKAARREQQ P ALRGGP RL 
SL 


6888 


1 


992 


FVAWKKEIPHIWTHCLLNPHALViK^LPT^RDALFTWRVI 
NFIKGRAPNHPJLFC^FFEEIGIEYSVLLFHTEMRWLSRGQILTH 
IFEMYEEINQFI^HKSS^VDGFENKEFKIHIAYIJulLFKHLNE 
LSASMQRTGMNTVSAREKLSAFVRKFPFWQKRIEKRNFTNFPFL 
EEIIVSDNEGIFIAAEITLHLQQLSNFFHGYFSIGDLNEASKWI 
LDPFIaFNIDFVDDSYLMKNDLABIiRASGQILMBFETMKIiEDFWC 
AQFTA FPKLAKTALE I LMP FATTYbCELG FS I TFTFQNKVPEAA 
LILSDDIRVAISKKVPSFLGHH 


6889 


1 


1534 


LTL2NQI KB EREQ DNSES PNGRTS PLVSQNNEQGSTLRDLLTTT 
AGKLRVGS TDAGIAFAPVYSMGAPSSKSGRIWPNIIiDDI IAS W 
ENKIPPSKTSKINVKPELKEEPBESII5AVDENNKLYSDIPHSW 
ICEKHILWLKDYKNSSNWKLFKBOTKQGQPAVVSGVHKKMNISL 
WKAES 1 S LD FGDHQADLLNCKDS I ISNANVKE FWDGF E B VS KRQ 
KNKSGETWLKLKDWPSGEDFKTMMPARYKDLLKSLPLPEYCNP 
EGKFNLASHLPGFFVRPDLGPRLCSAYGVVAAKDHDIGTTNLHI 
EVS DWNI LVYVG I AKGNG I LS KAG I LKKFE E EDLDD I LR KRLK 
DSSEIPGALWHIYAGKDVDKIREFLQKI3KEQGLEVLPEHDPIR 
DQSWYVNKKLRQRIjLEEYGVRTWTIjIQFLGDAIVIJ>AGAIiHQVO 
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








NFHSCIQVTEDFVSPEHLVESFHLTQEIiRLLKEEINYDDKLQVK 
NILYHAVTtEMVRALKIHEDEVDDMEEN 


6890 
6891


3 


667 


THACGMWI PLYTjHRALVVHKTABTCNS p p CGAKDS L I PGAITC F 
TG FLGVDTGAG ATRW CRLKTQRADPLVCAVGMLGSA IFICLIFV 
AAKSS I VGAYI CI FVGETLLFSNWAITAD ILM WVI PTRRATAV 
ALQS FTSHLLGDAGSPYL IGFI SDLIRQSTKDSPLWEFLSLGYA 
LMLCPFVWI^MFFliATALFFVSDRARAEQQVNQLAMPPASVK 




1980 


1262 


LRIHQELLSKELKLIAGITIESIIKIGIJUVGKEQFMQDASNV^5~ 
LLLKTQSHLYllMED3tWPEVROAAAYT5Lf^VManpnr2nnvD ct rcu 
AVPLL VK V I KRAHSKTKKNVI ATENCI S AIGKI LKFKPNCVNVD 
B VL PH WLS WL PLHEDKEEAI QTLS FLCDL I ESNHP WI G PNNSN 

LPKXISIIAEGKXNETINYEDPCAKRLANWRQVQTSEDLWLBC 
VSQLDDEQQEALQBLLNFA 


6892 


3 


876 


RSVAAASGPGAWGTDHYCLELLRKRDYEGYLCSliLLPAESRSSV 
FALRAFNVELAQVKDSVSEKTIGLMRMQFWKKTVEDIYCDNPPH 
QPVAI ELWKAVKRHNLTKRWLMKIVDERETCNLDTiK'ivYPiJT itpt v 
NYAENTQSS LL YLTLE ILG I KDLHADHAASH I GKAQG IVTCLRA 
TPYHGSRRKVFLPMDICMLHGVSQEDFLRRNQDKNVRDVIYDIA 
SQAHLHLKHARSFHKTVPVTCAFPAFLQTVSLEDFLKKIQRVDFD 
IFHPSLQQKNTLLPLYLYIQSWRKTY 


6893 
6894


1 


842 


DGERKSMS VERTFS EINKAEEQYSL CQELCS EIiAQDLQKli RL KG 
RTVT I KLKNVNFE VKTRAS T VSS WS TAKE I F AI AKELLKTE ID 
AD FPHPLRLRLMGVR 1SSFPNEEDR KHQQRS 1 1 GFLQAGNQALS 
ATBCTLEKTDKDKFVKPLEMSHKKSFFDKKRSERKWSHQDTFKC 
EAVN KQS FQTSQPFQVLKKKMNENLE ISBNSDDCQII*TCPVCFR 
AQGC1SLEALNKHVDECLDGPSISENFKMFSCSHVSATKVNKKB 
NVPASSLCEKQDYEAH


6895


1742 


1463 


TTLCKPLVPREHQFYETLPAEMRKFTPQYKGKSQLLEGLPHWRG 
DVRDRGHGRPWQPSLEPSLPPTLCFPSLSSFSSSWPSAQMLTPS 
VFNPW 




2379


478 


VTY VELCDIASPTALLIMRTVLDLI VEDLQS TSEDKEQQYTS QT" " 

rRLTJU^L YALASHKACKIAILHLINGTI KGDERYAEI FQDLLAL 

VRSPGDSVIRQQCVEYVTSILQSLCDQDIALILPSSSEGSISEL 

EQLSNS LPNKELMTS I CDCLLATLANSESSYNCLLTCVRTMMFL 

AEHD YGLFHLKSSLRKNSS ALHS LLKR WSTFSKDTGELAS S FL 

E FMRQ I LNS DT IGCCGDDNGLMEVEGAHTSRTMS I NAAELKQLL 

QSKEESPENLFIjELEKLVLEHSKDDDNLDSLLDSWGLKQMLES 

SGDPLPLS DQD VEP VLSAPESLQNLFNNRTAYVLAD VMDDQ h KS 

MWFTPFQAEEIDTDLDLVKVDLIBLSEKCCSDFDLHSELERSF1, 

SEPSSPGRTKTTKGFKI^KHKHETFITSSGKSEYIEPAKRAHW 

PPPRGRGRGGFGQGIRPHDIFRQRKQNTSRPP3MHVDDFVAAES 

KEWPQDGIPPPKRPLKVSQKI3SRGGFSGNRGGRGAFHSQNRF 

FTPPASKGNYSRREGTRGSSWSAQNTPRGNYNESRGGQSNFNRG 

PLPPLRPLSSTGYRPSPRDRASRGRGGLGPSWASAN3GSGGSRG 

&rvbGG5GRGRHVRS FTR 


6896 


1 


555 


GNIVIQKKKYNKQHI IPLENVTIDSIKDEGDLRNGWLlKTPTKS 
FAVYAATATEKSEWMNHrNKCVTDLLSKSGKTPSNEHAAVWVPD 
S EATVCMRCQKAKFT P VNRRHHCRKCGFVVCGPCSE KRFLLPSQ 
S S KPVR I CDFC YDLLS AGDMATCQPARSDS YS QSLKS P LNDMS D 
DDDDDDSSD 


6897 


3 


920 


GDGLMHEVWGLMERPDWETAIQKPLCSLPAGSGJJAliAASLNHY 
AGYEQVTNEDLLTNCTLLLCRRLLSPMNLLSIlHTASGLPJLF^VL 
SLANG F I AD VDLESBKYRRI/3EMRFTLGTFLRLAALRT YRGRLA 
YLPVGRVGSKTPASPVVVQQGPVDAHLVPLEEPVPSHWTVVPDE 
DFVLVLAL LHSHLGS EMFAAPMGRCAAGVMHLFYVRAG VSRAML 
LRLFLAMEKGRHMEYECPYLVYVPWAFRLEPKDGKGVFAVDGE 
LMVSEAVQGQVHPNYFWMVSGCVEPPPSWKPQQMPPPEEPL 


6898 


919 


346 


QKTVTAVAS I>I*KGRCGI YTENERRMGAVI KIRFFKIMLVLI ICW" 
LSNI INESLLFYLEMQTDINGGSLKPVRTAAKTTWFIMGILNPA 
QGFLLSIAPYGWTGCSLGFQSPRKEIQWESLTXSAAEGAHPSPI, 
MPHENPASGKVSQVGGQTSDEALSMLSEGSDA3TIEIHTASESC 
NKNEGDPALPTHGDL 


6899 


120 


827 


M KVR KNNDAYLLDKNKI NMDCF I S CFFKKMLTTL»MFS HSG I JjS L 
LEHGEEYTFSLPCAYARS I LTVPWVELGGKVS VNCAKTGYSAS I 
TFHTKP FYGGKLHRVTAE VKHNI TNTWCRVQG EWNS VLE FTYS 
NGETKYVDIfTKLAVTKKR VRPLE KQDPFESRRLWKNVTDSIiRES 
E IDKATEHKHTLEERQRTEERHRTETGTPWKTKYFIKEGDGWVY 
HKPLWKIIPTTQPAE 


6900 


3 


451 


TEVLGSKGIHELRSSTSALHHALEESASLLTMFWRAALPSTHI P 
VLPGKVGESTERELLELRTKVSQQEQLLQSTTEHLKNANQQKES 
MEQF I VSQLTRTHDVLKKARTNLEVRKLLHQSEAP S LS PTKHH P 
LADLVGDSWPALRFQEK 


6901 


1 


201 


DDNMV QrLe TD FKMTL<XK}SrijEQ WAAWLDNVMMQALKP YEGRP ' ' 
SFPKAARQFLLKWSFYRYHLGFS 


6902 


2


267 


GAPPPPPSQPPRQPPQAAPaSHPHSDLTFNPSSALSGQAGAQGA 
SDMPEPSLDLLPELTNPDELLSYLDPPDLPSNSNDDLLSLFENN 


6903 


1 


149 


RINQVYRQGPTGIHIliVIDQMVQNFQDESCFLFSTVKAES^DGI 
HI ILK 


6904 


464 


2092 


MEASL P VSLS C VLACGD VEGKFDI LFNRVUAI Q KKSGN PDLLIjC 
VGNFFGS TQDAE WEE YKTG I KKAPIQT YVLGANNQETVKYFQDA 
DGCELAENITYLGRKGIFTGSSGLQIVYLSGTESLNEPVPGYSF 
S P KDVS S LRMMLCTTSQPKGVDI LLTS PWPKC VGNFGNSSGEVD 
TKKCGSALVSSLATGLKPRYHFAALEKTYYERLP YRNHI ILQES 
AQHATRF IALANVGNPEKKKYLYAFS I VPMKLMDAAELVKQPPD 
VTENPYRKSGQBASIGKQIIAPVEESACQFFPDLNEKQGRKRSS 
TGRDSKSSPHPKQPRKPPQPPGPCWFCLASPEVEKHIiWNIGra 
CYLAIAKGGLSDDHVLILPlGHYQSVVELSAEVVEEVEKYKArL 
RRFFKSRGKWCWFERNYKSHHLQLQVI PVPISCSTTDDIKDAF ' 
ITQAQEQQI ELLEI PEHSDI KQ IAQPGAAYPYVELDTGEKLFHR 

IKKNFPLQFGREVLASEAILNVPDKSDWRQCQISKEDEETLARR 
FRKDFE P YD FTLDD 


6905 
6906 


1 


226 


VSKTGEAETITSHYLFALGVYRTLYLFNV7IWRYHFEGFFDLIAI 
VAGLVQTVLYCDFFYLYXTKVLKGKKItSLPA 




3 


611 


SYDDHMGHIDFITAASNLRAKMYSIEPADRFKTKRtAGKlIPAI 
ATTTATVSGLVALEM IKVTGGYP FEAYKNWFLNLAIPI WFTET 
TE VRKTKI RNGI S FT I WDR WTVHGKEDFTLLDFINAVKEKYG T E 
PTMWQGVKMLYVPVMPGHAIO^LKLTMHKLVKPTTEKKYVDLl^ 
S FAPDIDGDEDLPGPPVRYYFSHDTD 


6907 


2 


2228 


bRGVPVWAAGAFRFSSGEESTSHLIMSRR^QRLTRYSQGDDDGS- 
oa»w>aavALi;>uto l lib iUJdPIiRTIjKRKSSNMKRIjSPAPQZjGPSS 
PAHTSYYSESIiVHESWFPPRSSLEELHGDANWGEDLRVRRRRGT 
GGSESSRASGLVGRKATEDFLGSSSGYSSEDDYVGYSDVDQQSS 
S SRLRS AVS RAGSLLWMVATS PGRLFRUjYWWAGTTWYRIjTTAA 
SLLDVFVLTRRFS SLKTFLW FLLPLLLLTCLTYGAW Y FYP YGLQ 
TFHPALVS WWAAKDSRRADEGWEARDSS PHFQAEQRVMSRVHSL 
ERRLEALAAE fssnwq KEAMRLERL ELRQGAPGQGGGGGLSHBD 
TLALLEGLVS RREAALKEDFRRE TAARIQEELSALRAEHQQDSE 
DLFKKIVRASQESBARIQQLKSEWQSMTQESFQESSVKELRRLE 
DQLAGLQQELAALALXQSS VAEEVGLLPQQ IQAVRDDVE3 QFPA 
WISQFLARGGGGRVGLLQREEMQAQLRELESKILTHVAEMQGKS 
AREAAASLSLTLQKEGVIGVTEEQVHHIVKQALQRYSEDRIGLA 
DYALESGGAS VISTRCSCTYETKTALLSLFGI PLWYHSQS PRVI 
LQPDVHPGNCWAFQGPQGFAWRLSARIRPTAVTLEHVPKALSP 

QAPTMATYQWELRI LTNWGH P EYT C I YRFR VHG E PAH 


6908 


3 


780 


QVPSAAWIiMAVCGLGSRLGLGSRLGLQGCFGAARLLYPRFQSRG 
PQGVEDGDRPQPSSKTPRIPKIYTKTGDKGFSSTFTGERRPKDD 
QVFEAVGTTDEL SSAI G FALELVTE 2CGHTFAEELQKI Q CTLQDV 
GSALATPCSSAREAHLKYTTFKAGPIIjELEQWIDKYTSQLPPLT 
AFI LPSGGK I S SALH FCRAVCRRAERRWPLVQMGETDANVAKF 
LNRLSDYLFTLARYAAM KEGNQEKI YKKNDPSAESEGL 


6909


3 


409 


GRLLAVGTDLYGQRSSAPEQELLVQDATPVSUSLLPEKAFSDIP 
SP YLRGTI KMMQAVRQA FQDQDDRRTWDGRPLTMAATFDDCLYA 
LCVVDTIKRSSQTGEWQWIAIMTEBPELSEAYLISEAMRRSRMS 
LYC 


6910 


1 


1068 

* 


LVPVWIDSYYYGKLVIAPLNIVLYNIFTPHGPDLYGTEPWYFY 
LI NG FLNFNVAFALALLVLPLTS LME YLLQRFHVQNLGHP YWLT 
LAP M Y I WF 1 1 FFI QPHKE BRFLFP VYPLI CLCGAVALS ALQHS F 
LYFQKCYHFVFQRYRLEOTTVTSNWLALGTVFLFGLLSFSRSVA 
LFRGYHGPLDLYPEFYRIATDPTIHTVPEGRPVNVCVGKEWYRF 
PSSFLLPDNWQLQFIPSEFRGQLPXPFAEGPLATRIVPTDMNDQ 
NLE E PSRY I D I S KCH YLVDLDTM RETPREPKYS SN KEEW I SLAY 
KViruui^KiiKjjJbKAr X Vf cJj£>Uu x 1 ViVNil IitKPRKAKQXRK 
KSGG 


6911 


1184 


966 


GEDAEEMETGNVANLIS I FGSSFSGLLRKSPGGGREBEEGEESG - 
PEAAEPGQICCDKPVLRDMNPWSTAIVAF 


6912 


1 


844 


AMKP VETHSFQMLFT ILSTGSALKAQS YEDAYRCI KSS ILLGS I 
ovjVj lUils UrWGnN C SLr Vl Man I (JARN lAaTlA V hIA WN LEG XA vW 
GBSGEL VCTKP I PCQ PTH FWNDENGNKYRKAYFS KFPG I WAHG D 
YCR INPKTGGI VMLGRSDGTLNPNGVRFGSSE I YNIVES FEEVE 

S ARHVPS L ILETKG I P YTLNGKKVEVAVKQI IAGKAVEQGGAFS 
NPETLDLYRD I PELQGF 


6913 


1643 


1558


KKSHEESHKEELSYGADASLPLPGSDFR 


6914 


1251 


615 


ELAAECKSAGYPGTLIPYRCDLSNEEDILSMFSAIRSQHSGVDI 
CI NNAGLAR PDTLLSGSTSGWKDMFNVNVLALS ICTREAYQSMK 
ERNVDDGHIININSMSGHRVLPLSVTHFYSATKYAVTALTBGLR 
QELREAQTHIRATCISPGWETQFAFKLHDKDPEKAAATYEQMK 
CLKPEDVAEAVIYVLSTPAHIQIGDIQMRPTEQVT 


6915 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHWAISFT
ALILTELLMVALTVRTWHWLMVVAEFLSLGCYVSSLAFLNEYFD
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS


6916 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHWAISFT
ALILTELLMVALTVRTWHWLMVVAEFLSLGCYVSSLAFLNEYFD
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS


6917 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHWAISFT 
ALILTELLMVALTVRTWHWLMVVAEFLSLGCYVSSLAFLNEYFD
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS


6918 


28 


921 


PEAGTRSWRE PD P EDLRRFLLS AACRS FPQ WLPGGGGGQ VSS CS 
DTDVP YLLLAVKS EPGRFAERQAVRETWGS PAPG 1 RLLFLLGS P 
VGEAGPDLDS L VAWE S RRYS DLLLWD FLDVPFNQTLKDLLLLAW 
IjGRHCPTVSFVLRAQDDAFVTiTPAIJjAHI^tALPPASARSLYLGE 
VFTQAMPLRKPGGPFYVPESFFEGGYPAYASGGGYVIAGRLAPW 
LLRAAARVAP FPFEDVYTG L CXRALGLVPQAHPG FLTAW PADRT 
ADHCAFRNLLLVR P LGPQAS IRLWKQLQDPRLQC 


6919 


850 


41 


QGRRELSGSVFCPFIQQBPKEMLTIjSEYHERVRSQGQQLQQLOA " 
ELDKLHKEVSTVRAANSERVAKLVFQRLNEDFVRKPDYALSSVG 

as idlqktshdyadrntayfwnrfsfwnyarpptvi lephvfpg 
ncwafegdqgqwiqlpgrvqlsditlqhpppsvehtggansap 

RDFAVPFLLSFFTHQGLQVYDETEVSLGKFTFDVEKSEIQTFHL 

qndppaafpkvkiqilsnwghprftclyrvrahgvrtsegaegs 

AQGPH 


6920 


1418 


591 


EAQGPSKVHMTjKKKK ~ — 


6921 


2 


1711 


MWATi^EEQFHVINHAEQTLUKMl^iLltfiKQLCDVLLlAGHLRI 
PAHRLVI^AVSDYPRAMPTMH'T/T.'Pa VOPWrrDMirr'tmTikTJi t xh--y tr 

QYAYTGVLQL3CEDTIESLLAAACLLQLTQVIDVCSNFL1KQLHP 
SNCLGIRSFGPAQGCTELLNVAHKYTMEHFIEVIKNQEFLLLPA 
NE1SKLLCSDDINVPDEETIFHALMQWVGHDVQNRQGELGMLLS 
YIRLPLLPPQUADLETSSMFTGDLECQKLLMEAMKYHLLPERR 
SMMQS PRTKP RKST VGAL YAVGGMDAMKGTTT I E KYDLRTNSWI* 
i riiM\jixi\Lt\4e u? VAV X L»iv ivLi i V VooKXX^ijjCI I^NTvECFNP VGK 
I WT VMPPMSTHRHGLGVATLEGPM YAVGGHDG WS YLNTVERWDP 
EGRQMNYVASMSTPRSTVGWALNNKLYAIGGRDGSSCLKSMEY 
FDPH1NKWSLCAPMS KRRGGVG VAT YNG FLY WGGHD APAS NHC 
SRLSDCVER YDPKGDS WST^APLS VPRDAVAVC ?LGD KLYVVGG 
YDGHTYLNTVESYDAQRNEWKEBVPVNIGRAGACVVVVKIiP 


6922 


1075


369 


LTPPAG1RHEVRDRBREREREREREKFPLDSTGSELKQNIHSIT 
wur rr%i | U IV v n •* iv.vaxiMAro.UJv. x x»K£. X 1\.V X olaAK I MGGGS T IND VLA 
Vm'PKDAAG^DAKAEENKKEPLCRQKQHRKVLDKGKPEDVMPSV 
KGAQERLPTVPLSGMYNKSGGKVRLTFKLEQDQLWIGTKERTEK 
LPMGSIKNVVSEPIEGHEDYHMMAFQLGPTEASYYWVYWPTQY 
VDA1 KDTVLGKWQYF 


6923 


2469 


1660 


LiGIiFCILPlDTIiCAVLERDTLS IRE SRLFGAWRW AEAE CQ RQQ " 
LPVT FGNKQ KVLGKALSLIRF PLMT I E EFAAG P AQS GI LS DREV 

VT^FLHFTVNPiCPRVEYIDRPRCCLRGKECCrNRFQQVESRWGY 
SGTSDRIRFTVlJRRISTVGFGT,vnQT'Hf20TnvnxrKTTr\T t r> vr w 

G^GQNDTGFSCDGTANTFRVMFKB P IEILPNVC YTACATLKGP 
DSHYGTKGLKKVVHETPAASKTVFFFFSSPGNNNGTSIEDGQIP 
EIIFYT 


6924 


2210 


1235 


PEBRVICFVEYYLTAFHEGRKGAIiAKKPYNPI IGETFHCSWEVP ' 
KDR V K P KRTAS RS PAS CHE 1 1 PMADD P SKS YKLR FVAEQ VSHHP P 
ISCFYCECEEKRLCVNTHVWTKS KFMGMSVGVSMIGEGVLRLLE 
HGEEYVFTLPS AYARS I LTIPW VELGGKVS INCAKTG YSATVI F 
HTKPFYGGKVHRVrAEVKHNPTNTIVCKAHGEWNGTLEFTYNNG 
BTKVIj^TTTLPVYPKKIRPLEKOGPMESRNLWREVTRYLRLGDr 
DAATEQKRHLEBKQRVEERKRENLRTPWKPKYFIQEGDGSGILQ 
SPLESTLMGLEVQSFPV 


6925 
6926 


2 

1


1653 
733 


RGCW^AAMBPDSVIEDKTIELMCSVPRSLWLGCANtVESMCAxi"" 
SCLQSMPSVRCLQISNGTS S VI VSRKRPSEGNYQKEKDLCI KYF 
DQWSES DQVEFVEHLI SRMCHYQHGH INS YLKPMLQRDFI TALP 
EQGLDHIAENILSYLDARSLGAAELVCKEWQRVIS EGMLWKKLx 
ERMVRTDPLWKGLSERRGWDQYLFKNRPTDGPPNS FYRSLYPKI 
IQDIETIESNWRCGRHNLQRIQCRSENSKGfVYCLQYDDEKI ISG 
LRDNS I KIWDKTS LECLKVLTGHTGSVLCLQYDERVI VTGSSDS 
TVR VWDVJTTCEVL^L IHHNEAVLH L R FSNGLMVT CS KDRS IAV 
WDMAS ATDITLRRVLVGHRAAVNVVDFDDKY I VSASGDRTI KVW 
STSTCEFVRTLMGHKRGIACLQYRDRLVVSGSSDNTIRLWDIEC 
GACLRVLEGHEELVRC IRFDNKRI VSGAYDGKI KVWDLOAALD P 
RAPASTLCIiRTLVEHSGRVFRLQFDBFQI ISSSHDDTILIWDFL 
NVPPSAQNETRS PSRTYTYISR 

SGRVAMDGiySLQFPEG^FPAGPPJJjPPHMGGHYRDCQSIiGAPP]!, 
DGyPLPTPDTSPLDGVDPDPAFFAAPMPGDCPAAGTYSYAQVSD 
YAGP PEP PAG PMHPRLG PEPAG PS I PGLLAP PSALHVYYGAMGS 
PGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPTPPPBALPCRDGT 
DPSQPAELLGKVDRTEFEQYLHPVCKPEMGLPYQGHDSGVNLPD 
SKGAX SSWS DASSAVY YCNY PDV 


6927 


2 


1484 


LTLCGDIQLMLAQNANNRAAHLEEFHyQTypnnPTT hot tip ceo — 
OQG FAWATDLS TDLESQhS VS C2CCYEAANB ILQFRJDLJCSQNPEH 
YVQVLKRMGN IRNE I GVFYMNQAAALQSERLVS KS VSAAEQQLW 
KKSFSCFEKGlHNFESIEDATNAALT.T.r'MTmjT MDTP»n^u/v<ii 

GDBLK^EFSPEEGLYYNKAIDYYLKALRSLGTRDIHPAVWDSVN 
WELSTTYFTMATLQQDYAPLSRKAQEQIEKEVSEAMMKSLKYCD 
VDSVSARQPLCQYRAATIHHRLASMYHSCLRNQVGDEHLRKQHR 
VLADLHYS KAAKLFQLLKDAPCBLLRVQLERVAFAEFQMTSQNS 
NVGKIiKTLSGALD IM VRTEHAFQI> IQKEfc IEE FGQPKS GDAAAA 
ADASPSLNREEVMKLLSIFESRLSFLLLQSIKLLSSTKKKTSNN 
IEDDTILKTWKH I YSQL LRATANKTATLLERI NVIVHLLGQLAA 
GSAASSNAVQ 


6928 


1006 


777 


EAJDLlNNLIiQViCMRKRYSVDKTr <zjjvwt nnvnnrr nr ppt bow * 
*^ » v iiuvim^ » wivAljDflr'n.LiyL/i y I WLiDLREIiECK 

IGERYITHESDDLRWEKYAGEQGLQYPTHLINPSASHSDTPETE 
ETEM KALGER VS I L 


6929 


1749 


607 


RDQRGYRDDRSPAREPGDVSARTHSGGGGGRSATTAMPPPVPNG 
ywunuuuH v w v/iLfKt'dL.oKwc'KKA.lyKPQPAGGRRSG 

RGPAAGGLCLQPPDGGTCVPEEPPVPPMDWEALEKHLAGLQFRE 
QEVRNQGQARTNSTSAQKNERESIRQKLALGSFFDDGPGIYTSC 
SKSGKPSLSSRLQSGMNLQI CFVNDSGSDKDSDADDSKTBTSLD 
TPIiS PMSKQS S S YSDRDTTE E ESESLDDMDFLTRQXKLQAEAKM 
ALAMAKPMAKMQVEVEKQNRXKSPVADIAPHMPHT qptt mvt>«t 
KPTDLPJ>MTIGQLQVTVNDLHSQIBSIjNEBLVQLLI,IRDELHTE 

qdamlvdi edltrhaesqqkhmab kmpak 


6930
6931


131 


545 


fkdtanvfvslfqmri^frhyfibPsqlklfydvitwi'vtqvai 

S YTWP FVLLSI KPSLTFYSSWYYCUIILGILVLLLLPVKKTQR 




2 


659 


PVERLPNRPACLLVASGAAEG VS AQS FLHCFTMAS TAFflLQVAT 
PGGKAME FVDVTE SNARWVQDFRLKAYAS PAKLES I DGAR YHAL 
LI PSCPGALTDLASSGSLARI lqhfhseskpicavghgvaalcc 

atnedrswvfdsysltgpsvcelvrapgfarlplwedfvkdsg 
acfsasepdavhvvldrhlvtgqnasstvpavqnllflcgsrk 


6932 


2 


1131 


FVDSPGQGEQAEEfiBGGIQMNSRMRAHSPABGASVBSSSPGPKK 
SDMCEGCRSLAAGHPGYISHDKETSIICYVSHQHPSHPQLFSIVR 
<^CVRSLSCEVCPGR2GPIFFGDEQHGFVFSHTFFIKDSLARGF 
QRWYSII TIMMDRI YLINS WPFLliGKVRGI IDELyGXALKVFEA 
EQFGCPQRAQRMNTAFTPFLHQRNGNAARSLTSLTSDDNLWACL 
HTSFAl^LKACGSRLTEKLLEGAPTEDTLVQMEKLADLEEESES 
WDNSEAEEEEKAPVLPESTEGRELTQGPAESSSLSGCGSWQPRK 
uf vr AbljKrtWKO vGGRGTAHHELRRRANHGLCLPTRLASGPSTI* 
KTLQEVTDSLLGG WLMAQGVGGI I 


6933 


1431 


890 


SLNLHCTLPPFPHQYPAGYPSDKEGKKPKGwSKXQPSGTTKRPI 
SDDDCPSASKVYKASDSAEAIEAFQLTPQQQHLIRBDCQNQKLW 
DE VLSHLVEGPNFLKKLEQS FMCVCXTQELVYQP VTTE CFHNVCK 
DCLORS FKAQVFS CP ACRHDLGQNY1 MI PNE ILQTLLDLFFPGY 
SKGR 


6934 


30lQ 


2588 

1 


drdhsq<^irrvaij«vssvkliskakirtvkmtfiiv!afiv 

CWTPFFFVQMWSVWDANAPKBASAFI IVMLLASLNSCQTPWIYM 

LFTGHLFHELVQRFLCCSA3 YLKGRRLGETSASKKSNSSS FVLS 
KRSSSQRSCSQPSTA 


6935


886 


543


NSALYVAOQ^DGTSCLNSVBRYSPKAGAWESVAPMNlRRSTHDb 
VAMDG WLYAVGGNDGS S S LNS IEKYNPRTNKWVAASCMFTRRSS 
VGVAVLELLNFP PPSS PTLSVSSTSL 


6936 


1347 


567 


RS H RRQFLS RALLEFFGKSHP P PHRLPRKS LNVGLHYSHI P FLT 
TCLHFLRKRLQKGEVGLSVETSKPQVPVGGLSRKKVPQEPWATV 
MEKRLQEAQLYKEEGNQRYREGKYRDAVSRYHRALLQLRGIiDPS 
LPSPLPNLGPO^PALTPEQENILHTTQTDCYKNLAACLLQMBPV 
NYERVREYSQKVtiERQPDNAKALYRAGVAFFHLQDYDQARHYIil* 
AAVNRQPKDANVRRYLQLTQSELSSYHRKEKQLYLGMFG 


6937 


1 


727 


AVEFRCCPGRDPACFARGWRLDRVYGTCFCDQACRFTGDCCFDY 
DRACPAR PCFVGEWS PWSGCADQCKPTTRVRRRSVQQEPQNGGA 
PC?PLEERAGCIEYSTPQGQDCGHTYVPAFITTSAFNKERTRQA 
TS?HWSTHT2DAGYC^fEFKTBSLTPHCALENRPIlTRWMQYLREG 
YTVCVDCQPPAMNSVSLRCSGDGLDSDGNQTLHWQAIGNPRCQG 
TWKKVRRVDQCSCPAVHSFIFI 


6938 


3 


719 


NSRKLEIJ^RVDTDFMQLKKRRQSSEKENDSGTLDTVGAVVVDH 
EGNVAAAVSSGGLALKHPGRVGQAALYGCGCWABNTGAHNPYST 
AVSTSCCGEHLVRT 1 LARE CS HALQ AEDAHQALLETM QNKF IS S 
P FLASEDGVLGGVI VLRSCRCSAE PDSSQNKQTLLVEFLWSHTT 
BSMCVGYMS AQDG KAKTHI SRLPP G AVAGQSVAI EGGVCRLGE P 
SELTLQAECEASQRHFRT 


6939 


3 


810 


KVTAPRRPQRYSSGHGSDNSSVLSGELPPAMGRTALFHHSGGSS 
GYESLRRDSEATG3ASSAPDSMSESGAASPGARTRSLKSPKKRA 
TGLQRRRLI PAPLPDTTALGRXPS LPGQWVDLPPPLAGSLKEPF 
EIKVYEI DDVEPXQR PRPTPREAPTQGLACVSTRLRLAERRQQR 
LREVQAKHKHLCEELAETQGRLMLEPGRMLEQFEVDPBLEPESA 
EYLAALEPJITAALEQCVNLCKAHVMMVTCFDISVAASAAIPGPQ 
EVDV 


6940 


1188 


496 


GKMAAQPLRHRSRCATPPRGDFCGGTERAIDQASFTTSMEWDTQ 
WKGS SPLG PAGLGAEEPAAGP QL PSWLQPERCAVFQCAQCHAV 
LADS VHLAWDLSRSLGAWFSRVTNNWLEAPFLVGI EGSLKGS 
TYNLLFCGSCGIPVGFHLYSTHAALAALRGHFCLSSDKMVCYLL 
KTKArVNASEMDIQNVPLSEKIAELXEKlVLTHNRLKSLMKILS 
EVTPDQSKPEW 


6941 


1 


713 


SLS PADS DPHGP HTCGHVLNVI IGSNVLAiAEAQRQAEALGYQA 
VVLSAAKQGDVKSMAQFYGLLAHVARTRLTPSMAGASVEEDAQI, 
HELAAELQ I PDLQLEEALETMAWGRGP VCLLAGGEPTVQLQGSG 
RGGRNQE LALRVG AELRRW PLG P I DVLFLSGGTDGQDGPT3AAG 
A5fVT?ELASQAAAEGLDIATFLAHNDSHTFPCCLQGGAHLLHTG 
MTGTNVMDTHLLFLRPR 


6942 


1 


246 


GDYVERyDPKTDTWTMGAPtiSMPTNAVGGCIiLGDRLYADGGYDG 
QTYLNTMESYDPQTNBWTQMASLNIGRAGACVVVIKQP 


6943 


1 


739 


PNATGDGAKTLAIHVKALTADS I R 1 TWKATLPAS S FRLS WLRLG 
HS PAGGS ITETLVQGDKTEYLLTALEPKPTYI I CMVTMETTNAY 
VADETPVCAKAETADS YGPTTTLNQEQNAGPMASLPLAGI IGGA 
VALVFLFLVI^AICWYVHQAGELLTRERAYNRGSRKKDDYMESG 
TKKDNS ILEIRGPGLQMLPINPYRAKEEYVVHTIFPSKGSSLCK 
ATHTIGYGTTRGYRDGGIPDIDYSYT 


6944 


960


156 


VANI LLNGVKYESELTGSSERAEQPLSVGRLCSTI CNM PKALRT 
LCVNHFLGWLSFEGMLLFYTDFMGEVVFQGDPKAPHTSEAYOKY 
NSGVTMGCWGMCI YAFS AAFYS A I LEKLEEFLS VRTLYF I AYLA 
FGLGTGLATLSRNLYWLSLCITYGILFSTLCTLPYSLLCDYYQ 
S KKFAGSS ADGTRRGMGVD I SLLS CQ YFLAQ I LVSLVLG PLTSA 
VGSANGVMYFSSLVSFLGCLYSSLFVIYEIPPSDAADBBHRPIiL 
LNV 


6945 


2067 


179 


egedrglprtmgaawtgtrlapwpgracgalprwtptapacJgc 

HS K PGP AR PVP L KKRG YD VTRNPHLNKGMAFTLEB RLQ LGI HGL 
IPPCFLSQDVQLLRIMRYYERQQSDLDKYI I LMTLQDRNEKLFY 
RVLTSDVEKFMP I VYT PTVGLACQHYGLTFRR PRGLFI T IHDKG 
Hl»ATMLNSWPBDNIKAVVVTDGERlLGI/3DI/3CrYGMGIPVGKIiA 
LYTACGGVNP QQCLP VLLDVGTNNE BLLRDPLYI GLKHQRVHGK 
AYDDLLDEFMQAVTDKFG I NCLIQFBDFANANAFRLLNKYRNK Y 
CM FNDD I QGTASVAVAG ILAALR I T KN KLSNHVFGFQG AGE AAM 
G\IAHLLVMALB\KEGVPKA\EATRKIW\MVDF\KGLIVQGRDH 
LNHE KEMFAQD \HPE VNS LEE WRL VKPTAI I G VAA IAEA\ FTE 
Q1LRDMASFHERP\IIFALSNPTSKAECTA\EKCYRVTEGPRGF 
FAS\GSPF*GVLIWEMGKTFIPGGRGNNA*RVPRGWQLGVHSPG 
GDPGH I P\DE IFL PDSRAKL PQEVS BQHLSQGRL YP\ PLS T \ IR 
NVFLRIAIKVFD*GYKHNLV\SYYPEPKD\KEAFCKIPGSYTPD 
YDS FYT/VDSYI WAQGKAMNVQTV 


6946 


133 


2551 


SCEYSGI TVA PGDPCPG VAHLLAPSMASDTPES LMALCTD FCIiR 
NLDGTLGYLLDK2TLRLHPDIFLPSEI\CDRLVNEYVBLVNAAC 
NF\EPHE\SFFNPLFRDPRKQPASRRJHL\RBD\LVQD\QD\LE 
AIRKQDL\VEL\ YLTN\CEBCLSAKSLQTLRSFSHliiGVP *AFFG 
CSTWILLLRKENPGGL/CEDEYLFMPTCQVLVKDFTFEGFSRLR 
F\ LKLGRMID WVPVES \LLRPLNSLAALDLSG IQTSDAA\ FLTQ 
WKDSL\VSLVL\ YNMDLSDD! I IR \ VI VQLHKLRHLDI SRDRL3S 
YYKFKLTRE VLS LFVQKLGNLMSLD I S G \ HMTTiFWrQ Tav-rrv-o 

EAGQTSI\EPSKVSSIIPFRGFEGGPLQF\LGVF*GIFCGRLTH 
I PAY KVSGDKNE EQVLNAI E A YTEHR P E ITSRAINLL FDIAR IE 
RCNQLLRALKLVITALKCHKYDRWIQVTGSAALFYLTNSBYRSE 
QSVKLRRQVIQVVLNGMESYQEVTVORWCCLTLCWFSIPBELEF 
QYRRVNELIiSILNPTRQDESIQRlAVHLCNALVCQVDNDHKEA 
VGKMG F VVTMLKLTQKKLLDKTCDQ VMEFS W\ SALWNITDETPD 
NCEMFLNFNGMKLFI^CI^EFPBKQEljHRNMLGLLGNVAEVKEL 
RPQLMTSQFISVFSNLLBSKADGIEVSYNACGVLSHIMFDGPEA 
WG VCE PQREEVE ERMWAAI QS WD INSRRNINYRS FEPI LRLLPQ 
GISP VSQHWATWALYNJbVS VY PDKYCPLL I KEGGMPLLRDI I KM 
ATARQETKEMARKVIEHCSWFKEENMDTSR 


6947 


2 


1682 


TSVSTIPRGliASARPQSRSWRCCPVWRRSPGRARGRGLKMLNVP 
SQSFPAPRSQQRVASGGRSKVPLKQGRSLMDWIRI,TKSGKDLTG 
LKGRLIEVTEEELKKHNKKDDCWICIRGFVYNVSPYMEYHPGGE 
DELMRAAGSDGTELFDQVRRWVNYESMJbKECLVGRMAIKPAVLK 
DYREEEKKVLNGMbPKSQVTDTLAKEGPSYPSYDWFQTDSLVTI 
/EHIY*TEGYQFRLNNS*SSE*FLYSRNNY*GLLISYTYW/R*A 
MRFRKIFIiCGL/CESVGKIEIVLQKKENTSWDFLGHPLKNHNSL 
I PR KDTGLYYRXCQL I SKEDVTHDTRLFCLMIJP PS THLQ VP IGQ 
HVYLKLPITGTEIVKPYTPVSGSLLSEFKEPVLPNWKYIYFLIK 
IYPTGLFTPELDRLQIGDFVSVeSPEGNFKISKFQELEDLFLLA 
AGTGFTPM VKILNYAJLTD I PSLRKVKLMFFNKTEDDI I WRSQLE 
rjjrvc rojiu^LUJ vcrv uofie x. ;> Jt. V JMtiKQGH I S PALLSE FLKRNLDK 
SKVLVCICGPVPFTEQGVRLLHDLNFSKNEIHSFTA 


6948 
6949 


104 


58 


PDGAHSFFPDEYPTCSSLCI^CGVGCKKSMNHGKEGVPriEAKSR 

cr yshq ydnr vytckacyergeevs vvpktsastds p wmgiaky 
awsgyviecpncgvvyrsrqywfgnqdpvdtvvrtre i vhvwpgt 
dgfiikdnnnaaqrlldgmnfmaqsvs blslgptkavts wltdq i 
APAYWRPNSQIIjS cnkcats fkdndtkhhcraogeg fcds CSS k 

TRP VPERGWGPAPVRVCDNCYEAR/ TRPVS CYRGTS GR * RRRRT 




152


4656 


SI^CLSRPLTRPG'DDS^GSAMASGAGGVGGGGGGKIRTRRCH 
3GPIKPYQQGR<^HQGII^RVTESVKNIVPGWLQRYFNKNEDVC 


6950 






5* CSTDTS EVP RVJPENKEDHLVYADEES SNITDGR I TPEPA V&NT 

BEPSTTSTAST\YPDVLTRVSLYRSHLNFSMLESPALHCQPSTS 

SAFPIGSSGFSLVKEIKDSTSQHDDDNISTTSGPSSRASDKDIT 

VSKNTSLPPLWSPEAERSHSLSQHTATSSKKPAFNLSAFGTLSP 

SLGNSS nJCTSQLGDSPFYPGKTTYGGAAAAVRQSKLRNTPYQA 

P VRRQMKAKQLSAQS YG VTSS TARR Z UQSLEKMSS PLADAXR I P 

S I VS S PLNS P LDRSG ID I TDFQAKREKVDSQ YP P VQRLMT PKPV 

SIATNRSVYPKPSLTPSGBPRKTNQRIDKKCSTGYEKNMTPGQN 

REQRESGFSYPNFSLPAANGLSSGVGGGGGKMRRERHAFVASKP 

LEBEEMEGPVLPKlSLPITSSSn,PTFNFSSPEITTSSPSPINSS 

QALTWKVQMTSPSSTGSPMFKFSSPIVKSTEANVLPPSSIGFTF 

SVPVAKTAELSGSSSTLEPIISSSAHHVTTVNSTNCKKTPPEDC 

EGPFRPAEILKBGSVLDILKSPGPASPKIDSVAAQPTATSPWY 

TR PAISSFSSSG IGFGESLKAGSS WQCDTCt LQNKVTDNKCI AC 

QAAKLSPRDTAKQTGIETPNKSGKTTLSASGTGFGDKFKPVIGT 

WDCDTCLVQNKPEAIKCVACETPKPGTCVKRALTLTVVSESAET 

MTASSSSCTVTTGTLGFGDKFKRPIGSWECSVCCVSNNAEDNKC 

VSCMSEKPGSSVPTSSSSTVPVSLPSGGSLGLEKFKKPEGIWDC 

ELCXVQNKADSTKCLACESAKPGTKSGFKGFDTSSSSSNSAASS 

S FKFG VSS SSSGPS QTLTS TGNFK PGDQGGFKI GVSSDSG Y I NP 

MSEGF*FSKHIVGFKFGVSSESKPBEVJCKDSKNDNFKFGLSFGL 
SNPVFLTPFQFG VSNIiGOEE KKEELL KS q narjpp T?r »rr-t r t xt<? w 
VPANT I VTS ENKS S FNLGT I ETKS VSVAPLKCQTS EAKKEEM PA 
TKGGF5FGNVEPASLPSASVFVLGRTEEKQQEPVTSTSLVFGEG 
KriTMKEPKC\QPVFSFGEFQRQTKDENSSKSTFSFSMTKPSEKE 
SEQPAKATFAFGAQTNTTADQGAAKPDIiS YLNNSSSS SS TPATS 
AGGG \ I FG SSTSS S WPP VATFVFGQS SN PGSS £ \AFGNTAES S T 
S QSLLFS QDSKLATTS S TGTAVTPFVFG PGASENNTTTS SFG FG 
ATTTSSSAGSSFVFGTGPSAPSASPAFGAUQTPTFGQSQGASQo 
NPPGFGSISSSTALFPTGSQPAPPTFGTVSSSSQPPVFGOQPSQ 
SAFGSGTTPNSSS AFQFGSSTTNFNFTNWS PSGVFTFGANS STP 

AASAQPSGSGGFPFNQSPAAFTVGSNGKNVFSSSGTSFSGRKIK 
TAVRRRX 


6951 


2585 
1940 


411 


PRFGSRSGLCRRAGERGAVRAGGLSRRTRAE , *IMDELHYQDTDS 
DVPEQRDSKCKVKWTHEEDBQLRALVRQFGQQDWKFLASHFPNR 
TDO^CQYRWLRVLNPDLVKGPWTKBEDQKVIELVKKyGTKQWTL 
IAKHLKGRLGKQCRERWHNHLNPEVKKSCWTEEEDRI 1CEAHKV 
LGNRWAEIAKMLPGRTDNAVKNHWNSTI KRKVDTGGFLSESKDC 
KPPVYLLLEIiEDKDGLQSAQPTBGQGSIiLTNWPSVPPTI KEEEN 
SEEELAAATTSKBQEPIGTDLDAVRTPBPLEEFPKREDQEGSPP 
ETSIiP YKWWEAANLLI PAVGSS LSEALDL 1 ESDPDAWCDLS K F 
DL?EEPSAEDSINNSLVQLQASHOQQVLPPRQPSA\LVPSVTBY 
RLDGHTISDLSRSSRGELIPISPSTEVGGSGIGTPPSVLKRQRK 
RRVALSPVTENSTSLSFLDSCNSLTPKSTPVKTLPFSPSQFLNF 
WNKQDTI^LBSPSLTSTPVCSQKVVTTTPLHRDKrPLHQKHAAF 
WiX x anuN x rni v I ±*t ftNAI»E Kr GPLKPLPQTPHLEEDLKE 
VLRSEAGIELIIEDDIRPEKQKRKPGLRRSPIKKVRKSiALDIV 
DEDMKLMMS TLPKS LS LPTTAPSNSS SLTLSG I KEDNS L LNQGF 

iKJAKPEJC^VAQKPRSHFTrPAPMSSAWKTVACGGTRDQLFMQE 
KARQLIX3RLKPSHTSRTLILS 






239 

« 

1 
1 


AGPDDTMKRS LiQAIi Y CQLIiS FLL I LALTBAliAFA IQE PS PRE S L 
aVLPSGTPPGTMVTAPHSSTRHTSWMLTPNPDGPPSQAAAPMA 
1TTPRAEGHPPT\TPSPPSLRQ*PPPII I KAP/SSTGPAPAAMAT 
rSS KPEGR PRGQAAPT I LZjTKP PGATSRPTTAP PRTTTRRP PRP 

PGSSRKGAGNSSRPVPPAPGGHSRSKEGQRGRNPSSTPLGQKRP 
JsKI FQI YKGNFTGS VE PEP S TLTPRTP1MGYSSS PQPQTVAAT 



658 



TVPSNTSWAPTiTSLGPAKDKP GLRHAAQGGGSTFTSQGGTPDA " 
TAASGAPVSP/PSCPSAFSAPPPR*PTGWPQP* *LLAYCYP \CT 
S RPLSTS SGVFTAATGPTPAAFDTSVSAPSQGI PQGAS TTPQAP 
THPSRVSESTISGAKEETVA\PSP*PTGCPVLSPQWYPQPOAIS 
STAWSPPGPGSLGQQGTSPMWPRGTNRSTBPPSA*ARWISPG*S 
WPSACPSPP\LCPADGVLHEEBEEDRQPGEQPEAYGNNTHHPGT 
TFQQAC \RGAAPGEIP VPLKPLRTQIiSEPRS PANGD YRDTGMVP 
C 



304 



6953 



1512 



349



6954 



"SIS' 



PESEGESGEMTDRYT lHSgLBHUjSK.yiG'AATPTPPSGSG\CE 
PTPRLVLLLHGPLRPSQLLRHCGE * EQSASPIiQLDGKDASALWT 
AS RQARGE LR LCLTTAVRGT S PS VS P VCQSS 
NWCjKTRAIASGKHVPFGKQT NPNKS/ VHCDS *G» *RRETTQDES 
TS PHFRGKMGGW\ KLEKELENTBQ P VGGNEG * EHEVTGWLNSD 
PLLELCQCPLCQLDCGSREQLIAHVYOHTAAWSAKSYM\CPVC 
GRALSS PGSLGRHLL IHS EDQRSNCAVCG AR FTS HATFNSEKLP 

EVLNMESLPTVHNEGPSSAEGKDIAPSPPVYPAGILLVCNNCAA 
YRKLLEAQTPSVRKWALRRQMEPLEVRLQRLERERTAKKSRRDN 
ETPEERSVRRMRDREAKRI^RMQETDEQRARRLQRDREAMRLKR 
AIETPEKRQARLIREREAKRLKRRLEKMDMMLRAQFGQDPSAMA 
ALAAEMNFFQLPVSGVELDSQLLQKMAFEEQ NSSSLH 
PPPP F 1 1 PSh FREAGT *AG * KRSGDS3CS P P VEQ * A* TRAAAQN" 



"6955- 



1968 



782 



7 " * ~""v»n».i, -^»va - n^ov3i7C>AL£3fc'FVEQ*A*TRAAAQN 

* PQR * RWTEGNSPQAS AVATPGQGAS PAAPRCTP * PSRRHRRLP 
PGARP PAG* AAPAPT KPVJIiAG PASAPQPGAAPLS P PAPPL IRTR 

* CAGAAARGR PRRDRS PRPRTPGG CS WSEPRTP PAVSASAQTPS 
DAG*AGGR*GQRQRPSTGR* PPGVGGAGRSHRREGT IPGNPHPR 
^*^WQR*PGP/REWGI,*EPQGEEMSGPGGPGGAPPNQVGSS 

FPGRRQ VRAQ VAG AP VGHW GTRARQ VKTGGRRRARRTM PF ^ 
! WRS PG WS W I KTEDGWKRCES CS QKLER ENNHCNI SHS I ILNS ED 
GBIFNNEEHBYASKKRKKDHFRNDTNTQSFYREKW1YVHKESTK 
ERHGYCTLGEAFN RLDF S S AIQD IRRFNYWKLLQL I AKSQLTS 
hSGVAQKN YFNILDKI VQJCVLDDHHNPRLI KDLLQDLSSTLCIL 

/n*rsrevcisgkhqyldlpirmysrlattatgssdd*ase\ng 

LTLS DLPLHMLNNI LYR FSDG WDI ITLG QVTPTLYMLSEDRQLW 
KKLCQYHFAEKQFCRHLILSBKGHIEWKLMYFALQKHYPAKBQY 
! GDTLHFCRFf CS ILFWKDSGHPCTAADPDS CFTPVS PQHFI DLFK 
F 



QTSTSI FASPTSPPVLGESVLQ ^^FDLNNGSDAEQEEMBTQSS ' 
DFPPSLTQPAPDQSSTIQLHPATSPAVSPTTSPAVSLWSPAAS 
PErSPEVCPAASTWSPAVFSWSPASSA^/LPAVSLEVPLTASV 
TS P KAS P VTSPAAAFPTAS PANKD VSS FLETTADVEE ITGEGLT 

ASGSGDVMRRR1ATPEEVRLPLQHGWRREVRI KKGSHRWQGETW 

YYGPCGKRMKQFPEVIKYLSRNWHSVRREHFSFSPRMPVGDFF 

EERDTPEGLQWVQLSAEEIPSRIQAITGKRGRPRNTEKARTKEV 

P KVKRGRGRP PKVKI TELLNKTDNR PLKKLEAQETLNEEDKAKI 

AKSKKKMRQKVQRGECQTTIQGQARNKRKQETKSLKQKEAKKKS 

KAEKEKGKTKQEKGKEKVKRBKXEKVKMKEKEEVTKAKPACKAD 

XTLATQRRLEERQRQQM ILEEMKKP TEDMCLTDHQPLPDFSRVP 

GLTLPSGAFSDCLTIVEFLHSK3KVLGFDPAKDVPSLGVLQEGL 

LCQGDSLGEVQDLLVRLLKAALHDPGFPSYCQSLXILGEKVSEI 

PLTRDNVSEILRCFLMAYGVEPALCDRLRTOPFQAQPPQQKAAV 

IAFLVHELNGSTLIINEIDKTLESMSSYRKNKWIVEGRLRRLKT 

VIAKRTGRSEVEMEGPEECLGRRRSSRiyEVTSGMEEEEEEESI 

AAVPGRRGRRDGEVDATAS S I PELERQ IEKL S KRQL FFRKKLLH 

SSQMLRAVSLGQDRYRRR YWVLP YLAG IFVBGTEGNIiVPEE VIK 
KETDS LKVAAHASLNPALFSMKMEIAGSNTTASS PARARSP PT? K 










TKPGSMQPRnXKSPVRGQDSEQPQAQLQPEAQLHAPAQPQPQEQ" 

LQLQSHKGFLEQEGSpIiSLGQSQHDIiSQSAFLSWLSQTQSHSSL 

LSSSVLTPDSSPGKLDPAPSQPPEEPBPDEAESSPDPQALWFNI 

SAQMPCNAAPTPPPAVSEDQPTPSPQQLASSKPMNRPSAANPCS 

PVQFSSTPLAGIiAPKRRAGDPGEMPQSPTGLGQPKRRGRPPSKF 

FKQMEQRYLTQLTAQPVPPEMCSGWWWIRDPEMLDAMLKALHPR 

G I RE KALHKH LNKHRD FLQE VCLRPS ADP I FEPRQLPAFQEG I M 

SWSPKEKTYETDLAVLQWVEBLEQRVIMSDLQIRGWTCPSPDST 

REDLAYCEHLSDSQBDITWRGRGRBGLAPQRKTTNPLDIAVMRL 

AALEQNVERRYLREPLWPTHEVVLBKALLSTPNOAPEGTTTEIS 

YEITPRIRVWRQTLERCRSAAQVCLCLGQLSRSIAWEKSVNKVT 

CLV^KGDNDEFLLLC^GCDRGCHXYCHRPKMEAVPEGDWPCTV 

CLAQQVEGEFTQKPGFPKRGQKRKSGYSLKFSEGDQRRRRVLliR 

GRES PAAGPRYSEEGLSPS KRRRL9MRNHHSDLTFCEI ILMEME 

SHDAAWP FLEPVNPRLVSG YRR 1 1 KNPMDFSTMRERLLRGGYTS 

SEEFAADALLVFDNCQTFNEDDSEVGKAGHIMRRFFE\SRWEEP 

YQGKQGQSVRQGRWGVTLWHLPPTFQTKTCHFHL1JVILPW0TOV 
RYNPDF 




6957 


82 


3514 


^IVAMPEP-rKKEENEVPAPAPPPEEPSKEKBAGTTPAKDWTLV 
ETPPGEEQAXQNANSQLS ILFIEXPQGGTVKVGEDITFIAKVKA 
EDLSEKPTINGSRKWMDIiASKAGKHLQLKETFERHSRVYTFEMQ 
I IKAKDNFAGNYRCEVTYKDKFDSCS FDLEVHESTGTTPNI DIR 
f SAFKRSGEGOEDAGELDFSGLLKRREVKQGBEEPQVDVWELLKN 
TKP S E YEKIAFOYES PTCSGML KRLKRS I R B EKKSAAPAKI LDP 
VYQVDKGGRVR FWELAD PKLEVKWNKKGQELRPSTKY I FE DTR 
OQS I LNIDNCQMTDDSE YYVTAGDE KCSTBLLVREPP IMVTKQL 
EDTTD YCGER VELECEVS EDDAQVKWFKNGEEI I LVQTR YR I R V 
EGKKHILI IEGATKADAADYS VMTTGGQSS AKLSVDLKPLKILT 
PLTDQTVNLGKEI CLKCEI SENI PGKWTKNGLPVQESDRLKWH 
KGRIHIOiVIDHALTEDEGDYVFAPDAYNVTLPAlCVHVIDPPKII 
LDGLDADNTVTVIAGNKLRLEIPISGEPPPKAMWSRGDKAIMEG 
SGRIRTESYPDSSTLVIDIAERDDSGVYHINLKNEAGEAHASIK 
VKWDFPDPPVAPTVTBVGDDWCIMNWEPPAYDGGSP1LGYFIE 
KKKKQSSRWMRLNFDLCKETTFEPKKMIEGVAYE VRI PAVNA\ I 
GISKPSMPSRPFVPLAVTSPPTLLTVDSVTDTTVTMRWRPPDHI 
GAAGLDG YVLE YCFEGS TSAKQSDENGEAAYDLPAEDW 1 VANKD 
LIDKTKFTITGLPTDAKIFVRVKAVNAAGASEPKYYSQPILVKE 
I IEPP KIHS P KHLKQTY I RR VGDRVI LVTP FQGKPR PELTWKKD 

GAEIDKNQINIRNSETDTIIFIRKAERSHSGKYDLQVKVDKFVE 
TAS ID IRI IDRPGPPQIVKI EDVWGRNVALTWTPPKDDGNAAlT 
GYTIQKADKKSMEWLRVIEHIIEPVPHTELVIGNEYYFRVFSEN 
MCGLSEDAXMTKESAVIARDGKIYKNPVYBDFDFSEAPMFTQPL 
VNRLCHSG YMATLNCS VRGN PKPKI TWMKNKVAI VDDPR YRM FS 
NQGVCTLE I RKPS PYDGGT YCCKAVNDLGTVEIECKLEVKVIAQ 




6958 
6959 


274 
1 


1663 

1469


PRTSRVKTEGS QG S SAMD FS VKVD I EKE VTCP I CLELLTE PLS L 
v»\j/4v_a xj\nj.i\E,& v 1 loKGEbSCPVCQTRFQPGNLRPNR 
HLANIVBRVKEVKMSPOEGQKRDVCEHHGKKLQIFCKEDGKVIC 
WVCELS QEHQGHQTFRINE WKE CQEKLQVALQRL I KENQEAEK 
LEDDI RQERTAWKWYI Q I ERQKI LKGFNEMR VI LDNEEQRELQK 
LEEGEVNVLDNLAAATDQLVQQRQDASrLISDLQRRLRGSSVEM 
LQDVIDVMKRSESWTLKKPKSVSKKLKSVFRVPDLSGMLQVLKE 
LTDVQYYWVDVMLNPGSATSNVAISVBQRQVKTVRTCTFKNSNP 
CDFSAFGVFGCQ YFS SGKYYWE VDVS GKIAWILGVHS KI SS LNK 
RKSSGFAFDPSVWYSKVYSRYRPQYGYWIGLQNTCEYNAFEDS 
SSSDPJCVLTLFMAV\LPWLGFS 

^VHWEFGRGIEDFpyLFF<jLTHCQQRICSVTQAGVQWCDHSS 








LQPQTPGLNQSSHLSLLSSRDYRMLSSFNEWFWQDRFWLPPNVT 
WTKtiEDRDGRWPHPQDLIAALPLALVLIiAMRliAFERFIGLPLS 
RWLGVRDOTRRQVKPNATLEKHFLTEGHRPXEPQLSLLAAQCGL 
TLQQTQRWFRRRRNQDRPQLTKKFCEASWRFLFVLSSFVGGLSV 
LYHESWt»WAPVMCWDRYPNQLTLSCPAADSEA\SLYWWYIiLELG 
FYLSLLIRI*PFDVKRKGGGPSSIKPRPHYDPPSTA\DFKEQVIH 
HPVAVI IiMTFS YSANLLR IGS I> VLLLHDS SDYLLEACKM VNYMQ 
YQQVCDALFLI FSFVFFYTRLVLFPTQI LYTTYYES I SNRG PFF 
GYYFFNGLLMLLQLLHVFWSCLILRMLYSFMKKGQMEKDIRSDV 
E ESDS SEE AAAAQEFLQLKNGTAGGPR PAPTDGPRS RVAGRLTN 
RHTTAT 


6960


387 


2068 


AKWARE KEMQEF \TRS FF \RGRPDLSTI*THSI VRRRYLAHSGRS 

P CS D PER KRFR FNS B S ESGS E AS S PDYFG P PAKNGVAS RSHTH P 
KEENPRRA\SKAVEESSDEERQRDIjPAQRGEESSEEBEKGYKGK 
TR KKP WKKQAPGKAS VSRKQAR EESE ESEAE P VQRTAKK VEGN . 
KGTKSLKESEQESEEEILAQKKEQREEEVEEEEKEEDEEKGDWK 
PRTRSNGRRKSAREBRSCKQKSQAKRLLGDSDSEEEQKEAASSG 
DDSGRDREPPVQRKSEDRTQLKGGKRLSGSSEDEEDSGKGEPTA 
KGSRKMARLGSTSGBESDLBREVSDSBAGGGPQGERKNRSSKKS 
SRKGRTRSSSSSSDGSPEAKGGKAGSGRRGBDHPAVMRLKRYIR 
ACX5AHRNYKKLLGSCCSHKERLSILRAELEALGMKGTPSLGKCR 
ALKEOREEAAEVASLD VAN I T SGS GP PR RT?T AWNTPr.n'F A JV O'Dnt? 

LYRRTLDSDEERPRPAPPDWSHMRGI ISSDGESN 


6961 


340 


1646 


R P WS S PTM KPN FS LRLR I FNLNCWG I P YLS KHRADRMRRLGDFIi 
NQES FDLALLEEVWSEQDFQYLRQKLS PTYPAAHHFRSGI IGSG 
LCVFSKHPIQELTQKIYTLNGYPYMIHHGDWFSGKAVGLLVLHL 
SGMVLNAYVTHLHAE YNRQ KDIYLAHRVAQAWEIAQF IHHTSKK 

adwllcgdij^ihpedlgccllkewtglhdayletrdfkgseeg 
ktmvpkncyvsqqelkpfpfgvr1dyvlykavsgfyiscksfbt 
ttgfdphrgtplsdhealmatlfvrhsppqqnpssthgp\aers 
pl/mcvclkealdgslglgma\qarwwa\tfa\syviglgi*\ll 

LALLCVLAAGGGAGEAAILLWTPSVGLVLWAGAFYLFHVQEVNG 
LYRAQAELQHVLGRAREAQDLGPBPQLYALL\tGQQEGDRTKEQ 


6962 


Mb 


1646 


RPWSSPTMKPNFSLRLRI PNLNCWGIPYLSKHRADRMRRLGDFL 
NQES FDLALLEEVWSEQDFQYLRQKLSPTYPAAHHFRSGI XGSG 
LC VFS KH P IQE LTQHI YTLNG YP YM I HHGDWFS GKAVGLL VLHL 
SGMVliNAYVTHIiHAE YNRQKDI YLAHRVAQAVJELAQFI HHTS KK 
ADVVLLCGDLNMHPEDLGCCLLKEWTGLHDAYLETRDFKGSEEG 
NTMVPKNCYVSQQEIiKPFPFGVRIDYVLYKAVSGFYISCKSFET 
TTGFDPHRGTPLSDHEALMATLPVRHSPPQQNPSSTHGPXAERS 
PL/MCVCLKEALDGSLGLGMA\OARWWA\TFA\SYVTGLGI.\r*L 
LALLCVIiAAGGGAGEAAILLWTPSVGL\^WAGAFYl»FHVQEVNG 
LYRAQAELQHVLGRAREAQDLG P E PQ L YALL\ LGQQEGDRTKEQ 


6963 


374 


2618


RVJPLILKLLKKPKTAENQKASEENEITQPGGSSAKPGLPCLNF 
EAVLS PDPALIHSTHSLTNSHAHTGSSDCDIS CKGMTERIHS IN 
LHNFSNSVLETLNEQRNKGHFCDVTV11IHGSMLRAQRCVLAAGS 
PFFQDKLLLGYSDIEXPSWSVQSVQKLIDFMYSGVLRVSQSEA 
LQILTAASILQIKTVIDECTRIVSQNVGDVFPGIQDSGQDTPRG 
TPESGTSGQSSDTESGYLQSHPQHSVDRIYSALYACSMQNGSGE 
RSFYSGA WSHHETALGLPRDHHMEDPS WITR IHERS QQMBR YL 
STTPBTTHCRKQPRP VR IQTLVGNIHI KQEMEDDYD YYGQQR VQ 
ILERNESEECTEDfDQAEGTESEPKGESFDSGVSSSIGTEPDSV 
EQQFG PGAARDSQABPTQPEQAAEAPAEGGPQTNQLETGASS PE 
RSNEVEAfDSTVITVSNSSDKSVLQQPSVNTSIGQPLPSTQLYLR 
QTETLTSNLRMPLTLTSNTQVrGTAGNTYIiPALFTTQPAGSGPK 








PFLFSLPQPLAGQQTQFVTVSQPGLSTFTAQLPAPQPLASSAGH " 
STASGQGSKKPTECTLCNKTFTAKOJSIYVKHMFVHTGEKPHQCSI 
CWRS FS LKDYL I K\HMVTHTG VRAYQCS I CNKR PTQKS S LNVHM 
RLHRGEKSYECYICKKKFSHKTLLERHVALHSASNGTPPAGTPP 
GARAGP PGWACTEGTTYVCS VCPAKFDQ I EQFNDHMRMHVS DG 


6964 


1 


178


SGRPFFFFFSNTDVYFIKKVTNRWTAGSSYKMTRMKSIGKILLL 
QIFIG\NCSMFVLVI 




757 


208 


NVFIEPRIQGFMKTSAHPGQKHPDFSMGLLFPLLAALEVCSCGS 
SGSLG YNLPQNH \GIiLGRWTLVLLGQMRRIS PFL CLKDRSDFR F 
PQHKVEVSQLQKA\QAMSFLYDVLQQVFNFSHKALL\CCMEHDL 
PGPTPHFTSSAAGTPGDLLGAGDGRRRS WGQWV I EGS TLALRR Y 
FQESISTLE 


6966 


820 


1867 


1 1 TALG VRGM P GCPCPG CGMAG PRI tT .Ft/TA r AT ■ PT .TJZ P arsrs ono" 
ALRS RGTATACRLDNKESESWGALLSGERLDTW XCSLLGSLMVG 
LSGVFPLLVI PLEMGTMLRS EAGAWRLKQLLS FALGGLLGNVFL 
HLLPEAWAYTCSASPGGEGQSLQQQQQLGLWVIAGILTFLALEX 
/HVPGQQGGGDQPGPQQRPHCCCRRAQWRP LSGPAGCRAR PRCR 
GP\DIKVSGYLNLrAKTIDNFTHGLAVAASPLVSKKIGLLTTMA 
ILLIIElPHEVGDFAILLRAGFDRWSAAKLQLSTALGGUiGAGFA 
ICTOSPKGVEETAAWVLPPT3GGFLYIALVNVLPDLLEEEDPW 


6967 


162 


633 


GFLPFKYWILDLSASSRMETDCNPMELSSMSGFEEGSELNGFEG 
TDMKDMKIiEAEAWND VLFAVNNM FVS KSLRCADDVAYINVETK 
ERNRYCLELTEAGLKVVGYAFDQVDDHLQTPYHETVYSLLDTL\ 
S P AYREAFGKR \ LLQRLEALKRDGQS 


6968 


1 


2265 


RGGGGGRGGPGARERERPGEPERTMEAAAGGRGCFQPHPGLQKT 
LEQFHL3SMSSLGGPAAFSARWAQEAYKKESAKEAGAAAVPAPV 
PAATEPPPVLHLPAIQPPP PVLPGP FFMPSDRS TERCET VLEGE 
TI SCFWGGEKRLCLPQILNS VLRD FSLQQ INAVCDELH I YCSR 
CTADQLEILKVMGILPFSAPSCX3LITKTDAERLCNALLYGGAYP 
PP CKKELAASLALGLELS E RS VRVYHE\CFGKC KGL \ L VP ELYS 
SPSAACIQCLD\ CRLMYP PHKFWHSHKALENRTCHWGF \DSA\ 
NWRAYI LLSQDYTGKEBQARLGR \ CLDDVKEKFD YGNKYKRRVP 
R VSS E P PAS I R P KTDDTS SQS PAPSEKDKPS S WLRTLAGS SNKS 
LGCVHPRQRLSAFRPWS PAVSASEKELSPHLPALIRDSFYSYKS 
FETAVAPNVAIiAP PAQQKWSSP PCAAAVSRAP E PLATCTQPRX 
RKLT\n5TPGAPBTLAP VAAPEEDKOSEAEVE VE S REE FTS SLS S 
LSSPSFTSSSSAXDLGSPGARALPSAVPDAAAPADAPSGLEAEL 
EHLRQ ALEGGLD TKEAKE KFLHE WKMRVKQEE KLS AALQAKRS 
LHQ ELE FLRVAXKEKLREATEAKRNLRKE I ER1RAE NE KKMKEA 
NESRLRLKRELEQARQARVCDKGCEAGRLRAKYSAQIEDLQVKL 
QHAEADRE<2LRADLLREREAREHliEK\VVK\ELQEQLWPRARPE 
AAGSEG\AAELEP 


6969 


1855 


118 


AGTMHGRLKVKTSEEQAEAKRLEREQKLKLYQSATQAVFQKRQA 
GBT.DESVLELTSQI LGANPDFATLWNCRREVLQQLETQKS PEEL 
AALVK^ELGPLESCLRVNPKSYGTWHHRCWLLGRLPEPNWTREL 
ELCARFLEVDERNFHCWDYRRFVATQAAVPPAEELAFTDSLITR 
NFSNYSSWHYRSCLLPQLHPQPDSGPQGRLPEDVLLKELELVQN 
AFFTDPNDQSAWFYHRWLLGRADPQDALRCLHVS RDEACLT VS F 
SRP LL VGSRME I LL LMVDDS PL I VE WRTPDGRNRPSHVWLCDL P 
AAS LNDQL PQH7FRVI WTAGDVQKE C VLLKGRQEGWCRDS TTDE 
QLFRCELSVEKSTVLQSELESCKELQELEPEWKWCL\LTI ILLM 
RALDPLLYEKETLQYFQTLK\^W0PKRATY\LDDLRSKFLLENS 
VLKMEYAEVRVLHLAHKDLTVLCHLEQLLLVTHLDLSHNRLRTL 
PPAIiAALRCLED PP PRT \ VLQASDNAI ESLDG VTNLPRLQELL L 
CNNRLQQ PAVLQ PLASCPRLVLLNLQGNPLCQAVG I LEQLAELL 
PSVSSVLT 


6970 


3


152


GFLSRISGLLIiCRWTCRHCCQKCYESSCCQS SEDEVBI LGPFPA 
QTPPWLMASRSSDKDGDSVHTASEVPLTPRTNSPDGRRSSSDTS 
KSTYSLTRRISSLBSRRPSSPLID1KPIBFGVLSAKKBPIQPSV 
LRRTYNPDDYFRKPEPHLYSLDSNSDDVDSLTDEBILSKYQLGM 
LH PSTQ YDLLHNHLTVR V I EARDLPPP I SHDGSRQDMAHSNPY V 
KI CLLPDQKNS KQTGVKRKTQKP VFEERYTFE I PFLEAQRRTLL 
LTWDFDKFSRHCVIGKVSVPLCEVDLVKGGHWWKALI PSSQNE 
VELGELLLSIiNYLPSAGRLNVDVIRAKQLLQTDVSQGSDPFVKI 
QLA^HGLKLVKTKKTSFLRGTIDPFYNESFSFKVpQEELENASLV 
FTVFGHNMKSSNDFIGRIVIG\QYSSGP\SEPNHWRRMLNTHRT 
AVEQWHS LRSRAECDRVS PASLEVT 


6971 


37 


3702 


ACFYVPGSRSFKLIPRHGLVNMGRSGKLPSGVSAKLKRWKKGHS ' 

SDSNPAICRHRQAARSRFFSRPSGRSDLTVDAVKLHNELQSGSL 

RLGKS EAP ETPMEEEAE LVXjTEKS SG TFLSGLS DCTNVTFS KVQ 

RFWESNSAAHKEICAVLAAVTBVIRSQGGKETETEYFAAliIRKA 

AQHGVCSVLKGSEFMFEKAPAHHPAAISTAKFCIQEIBKSGGSK 

EATTTLHMLTIiLKDLL P CFPEGLVKS CS ETLLR VMTLSHVLVTA 

CAMQAFHSLFHARPGLSTLSAELNAQIITALYDYVPSENDLQPL 

LAWLKVMEKAHINIiVRLQWDLGLGHLPRFFGTAVTCLLSPHSQV 

LTAATQSliKE I LKEC VAPHMADIG S VTSSASG PAQS VAKMFRAV 

EEGLTYKFHAAHSSVLQLLCVFFEACGRQAHPVMRKCIiQSLCDL 

RLS PHFPHTAALDOAVGAAVTSMGPE VVLQAVPLE I DGSEETLD 

FPRS WLLPVIRDHVQBTRLGFFTTYFLPLANTUCS KAMDLAQAG 

STVESKIYDTLQWQMWTLLPGFCTRPTDVAISFKGLARTLGMAI 

SERPDLRVTVCQALRTL ITKGCQAEADRAEVSRFAKNFLP I LFN 

LYGQ PVAAGDTPAPRRAVLETIRT YLTITDTQLVNSLLEKAS EK 

VLDPASSDFTRLSVLDLWALAPCADEAAISXLYSTIRPYLESK 

AHGVQKKAYRVLEEVCASPQGPGALFVQSHLEDLJCKTLLDSLRS 

TSSPAKRPRLKCLE.HIVRKLSAEHKEFITALIPEVILCTKEVSV 

GAR KNAFALLVE MGHAFLR FG SN Q EE ALQ CYJUVL I Y PGL VG AVT 

MVS CS I LALTHLLFEFKGIiMGTSTVEQLLENVCLLLASRTRDW 

KSALGFI KVAVT VMDVAHLAKHVQLVMEAIGKL S DDMRRHFRMK 

LRNLFT\KFIPK\FGILTWGKKAVGPKEYHRVLVNIRKAEARAK 

RHRALSQAAVEEEEEEEEBEEPAQGKGDSIEEILAD3EDEEDNE 

EEERSRGKEQRKLARQRSRAWLKEGGGDEPLNFLDPKVAQRVIA 

TQPGPGRGRKKDHSFKVSADGRLIIREEADGNKMEEEEGAKGED 

EEMADPMEDVI 1RNKKHQKLKHQKEAEEEELE I PPQYQAGGSGI 

HRPVAKK" AMPftA R VTf V 21 YCXTWrtC WT3D TDTiDVn V T r>T \TD O irr md 
" *** v "*v ivrU'i irvjrvc* i xvrtiv n_n JVLrtv V S\ A. J\Jj Kir LJzfxJ\i± YLtvt Ko K LtVi K 

RKKMKLQGQFKGLVKAAQRGSQVGHKNRRKDRRP 


6972


2179 


973 


PGGAILLPLWRRTRPREATVPRGAAQRGRARSAEGRIPSSQSPS 
PAEAGGATRSPP PRPPRPARPPGPSAPPLLRSDAGPGATVSAAA 
AAATERARRGATMGAQLSTIjGHMVLFPVWFLYSLLMKLFQRSTP 
AITLESPDIKYPLRLIDREIISHDTRRFRFALPSPQHILGLPVG 
QHIYLSARIDGNLWRPYTP1SSDDDKGFVDLVIKVYFKDTHPK 
FPAGGKMSQYLESMQIGDTIEFRGPSGLLVYQGKGKFAIRPDKK 
SNP 1 1 RTVKSVGM IAGGTGI T PMljQV I RAI M KDPDDHTVCHL L F 
AN0TEKDILrj?PEI^IJ?NKHSARFKLWrriiDRAPBAWI3YGQG\ 
FVNEEM IRDHLPPPE\EE PLVLMCG P PPMIQ YACL PNL \DHVGH 
PTERCFVF 


6973


1 


1364 


LQPRCAHRGLRAQKCGRPAPG VDAMVLC P VI GKLIiHKRWLAS A 
SPRRQE ILSNAGLRFE WPS KFKEKLDKAS FATP YG YAMETAXQ 
KALEVANRLYQKDLRAPDVVIGADT3VTVGGLILBKPVDKQDAY 
RMLSRFE/SGREHSVFTGVAIVHCSSKDHQLDTRVSEFYEBT1CV 
KF5 ELSEEUiWEYVHSGEPMDKAGGYGIQALGGMLVESVHGDFI* 
NWGFPLNHFCKQLVKLYYPPRPEDLRRS VKHDS I PAADTFBDL 








SDVEGGQSEPTQRDAGSRDEKAEAGEAGQATAKAKCHRTRETLP 
PFPTRLLBLIEGFMLSKGLLTACKLKVPDLLKDEAPQKAADXAS 
KVDASACGMERLLDICAAKGLLEKTEQGYSNTETANVYLASDGE 
YS LHG F I MHNNDLTWNLFTYLE FAIREGTNQHHRALG KKAE DI» F 
QDAYYQSPETRLRFMRAMHGMTKLTACQVATAFNLSRFSSACDV 
GGCTGALARELARE YPRMQVTVFDLPD 1 1 ELAAHFQ PPG PQAVQ 
I H FAAGDFFRDP LPS AELYVLCR I LHDW PDDKVHKLLS RVABS C 
KPGAGLLLVETLLDEEKRVAQRALMQSLNMLVQTEGKERSLGEY 
Q CLL BLHGFHQ VQ WHLGGVLDA IL\ P PKW P P EAQAACSL 


6974 
" 6975 


3082 


2172 


R S CAA FAS FAS R P PI* ELFAP PGSHRS P PGRG VATS AQ CALS VRK 
LLAARPGTX3T KYQATMVYKTIiFALCI LTAGWRVQS L P TS APLSV 
SLPTNIVPPTTIWTSS PQNTDADT AS P SNGTHNNS VL P VTAS A P 
TSLLPKNISISSREEEITSPGSNWEGTNTDPSPSGFSSTSGGVH 
LTTTLEEHSIiGTPEAGVAATLSQSAASPPTLISPOAPASSPSSL 
STS P PEVFS AS VTTNHS STVTSTOPTGAPTAPF Q PTFP ontrr 

PTSHATAEPVPQEKTPPTTVSGKVMCELIDMET\PPPFPG 




2 


500 


RPRPTVHCCKWALKI^TAMBTLINVFHAHSGKEGDKYiCLSKKEL 
KSLLQTELSGFLDVKELML*ATEALKTFEEA+KSPIIQCSSSRS 
SLP PAPQPPPYL* LSAVPFP IHLPLPLLPPQAQKDVDAVDKVMK 
BLDENGDGEVDFQEYVVLVAALTVACNNFFWEfTS 


6976 


1216 


970 


j*ejv*oc y i.ei\nrfciUL V&UH-'^o VGRIMPHTEARI 
MNMEAGTLAKLNTPG ELCIRGYC VMIX3YWGEPQKTE RAVDQDKW 
YWTGDVATJV: NEQGFCKI VGRS KDMI TUflflVNT vdspt on it nrrmt 
P KVQEVQ WGVKDDRMGEE ICAC I RLKDGEETTVEE I KAFCKGK 
I SHFKI P KY I VFVTNYPLTISGKI QKFKLREQMERH LNL * IKQQ 
ACPGRLA 


6977 


1298 


589 


S LF I NTNLLS NQ I RKTS FGMCSE PIS DNTEDQ KG KLKTPDFA* R 
ANKKSKHHVNGNRTVEPFPEGTQMAVFGMGCFWGAERKFWVLKG 
VYSTQVGFAGGYTSNPTYKEVCSEICrGHAEVVRWYQPEHMSFE 
ELLICVTWENHDPTGX5MR0^3NDHGTQYRSAIYPTSAKQMEAALSS 
KENYQKVLS EHGFGPITTDIRBGQTF YYAEDYHQQYLSKNPNGY 
CGLGGTGVSCP VGI KK 


6978 


3 


242 


SFPFRDSRRCGCCKGSSLRHTAVAMVKl^KEAKQRJjQQLFKGSQ - 
FAIRWGFIPLVIYLGFKRGADPGMPEPTVLSLLWG 


6979 
6980


3917 
1 


2146 
420 


DEARVRGEAVAAAILSRCRHWSGPPPFPPSPPDRKGLRGTEPWE 
AG PGSGATPGARAMDVRRLKVNELR E ELQRRGLDTRGLKTELAE 
RLQAALEABEPDDERELDADDEPGRPGHINEEVETEGGSELEGT 
AQ PP PPGLQPHAEPGG YSGPDGH YAMDNI TRQNQF YDTQ VI KQE 
NESG YERRPLEMEQQQAYRPEMKTEMKQGAPTS FLPPEASQLKP 
DRQQFQSRKRPYEBNRGRGYFEHREDRRGRSPQPPAEEDEDDFD 
DTLVAIDTYWCDLHFKVARDRSSGYPLTIEGFAYLWSGARASYG 
VRRGRVCFEMKINEEISVKHLPSTEPDPHWRIGWSLDSCSTQL 
GE EP FS YG YGGTGKKSTNS RFENYGDKFAENDVIGCFADFECGN 
DVELS FTKNGKWMGIAFR IQKEALGGQALYPHVLVKNCAVEFNF 
GQRAEPYCSVLPGFTPIQHLPLSERIRGTVGPKSKABCEILMMV 
GLPAAGKTTWAI KHAASN PS KKYN ILGTNAIMDKMRVMGLRRQR 
NYAGR WD VI#I QQATQCLNRLIQIAARKKRNYI LDQTNVYGSAQR 
RKMRPFEGFQRKAIVICPTDEDLKDRTIKRTDEEGKDVPDHAVL 
EMKAN FTLPDVGDFLDEVLFI ELQRE EADKLVRQYNE EGRKAG P 
PPE KR F DNRGGGGFRGRGGGGG FQR YENRG PPGGNRGGFQNRGG 
GSGGGGNYRGGFNRSGGGGYSQNRWGNNNRDNNNSNNRGSYKRA 
PQQQPPPQQPPPPQPPPQQPPpppsYSPARNPPGASTYNKNSNI 
PGSSANTSTPTVSS YSPPQS FGFFPS TFQ PS YSQP P YNQGG YSQ 
GYTAPPPPPPPPPAYNYGSYGGYNPAPYTPPPPPTAQTYPQPSY 
NQ YQQ YAQQWNQ Y YQNQGQ WP PY YGNYDYGSYSGNTQGGTS TQ 
GTRGRKTGR VAAP STRRRTGNMQKLQTRS PAMSLS DPGLG YKPT 








CKTi^WPPLCSbHALHVFHCLFSSRLGTPVSPRLAMDliMtSeBA 

GGSCACAGSCKCKKCKCTSCKKSCCSCCPLGCAKCAQGCICKGA 
SEKCSCCA 


6981 


10 


1054 

128


PGRGFRRASLRPAFAARGVFQGGLGQAKQARTRACAALPTPHPS " 
APRLLEPQGVFSLFPPPPGPWPNMILTKAQYDEIAQCLVSVPPT 
RQSLRKLKQRFPSQSQATLLS I FSQE YQKHI KRTHAKHHTSEAI 
ES YYQRYLNGWKNGAAPVI*liDLANBVDYAPSLMARLI LERFLQ 
EHEETP PS KS I INS MLRDPSQ I PDGVLANQ VYQC IVNDCCYGP1> 
VDCI KHA IGHEHBVLLRDLLIiE KNLS FLDBDQLRAKG YDKTPDF 
ILQVPVAVEGHIIHWrESKASFGDECSHHAYLHDQFWSYWNRFG 
PGLVI Y WYG F I QELD CNRERG ILLKAC FPTNI VTLCHS I A 


6982 


153




FPQQDCSAPAAPGLAGSEPRRLRAYRRRRQRARGLKRVAWLAPP 
PSLLQGLQGWAQAPVDGTLGPEDSRASSPMIQNSRPSLLQPQDV 
GDTVETW.ILHPVIKAFLCGSISGTCSTLLFQPLDLLKTRLQTLQ 
PSDHGSRRVGMIAVLLXWRTESLLGLWKGMSPSIVRCVPGVGI 
YFGTLYSLKQYFLRGHPPTALESVMLGVGSRSVAGVCMSPITVI 
KTR Y ESG KYG YES I YAALRS I YHS EGHRGLFSGLTATLLRDAP p 
SG I YLMFYNQTKN I VPHDQVDATL I PlTNFS CG I FAG ILASLVT 
QPADVT KTHMQLY PLKFQWIGQAVTL1 PKDYGLRGFFQGGI PRA 
LRRTLMAAMAWTVYEEMMAKMGLKS 


6983 


82 


773 


BMSFIXJDPSFFTMGMWSIGAGAI^AAALALLIiANTDVFLSKPQK 
AALEYXiEDIDLKTLEKEPRTFKAKELWEKNCiAVIMAVRRPGCFL 
CREEAADLSSLKSMLDQLGVPLYAWKEHIRTEVKDFQPYFKGE 
IFLDEKKKFYGPQRRKMMFMGFIRLGVWYNFFRAWNGGFSGNLE 
GEGFILGGVF WGSGKQG ILLEHREKBFGDKVNLLSVLEAAKMI 
KPQTLASEKK 


6984 


1845 


1282 


GGRS AYSLPAGS LPRVPATAAAKMASG VQVADE VCRI FYDMKVR 

KCSTPEEIKKRKKAVIFCLSADKKCirVEEGKEILVGDVGVTIT 

DPFKHFVGMLPEKDCRYALYDASFETKESRKEELMFFLWAPELA 

PLKSKMIYASSKDAIKKKFQGIKHECQANGPEDLNRACIAEKLG 
GS&IVAFEGCPV 


6985 


1887 


1324 


RRTAGIYPCFPKPGRTRHALCSWLLLliTGQLAFDDFQESCAMM 
WQKYAGSRRSMPLGARILFHGVFYAGGFAIVYYLIQKFHSRALY 
YKLAVEQLQSIIPEAQEALGPPLNIHYLKLIDRENFVDIVDAKLK 

IPVSGSKSEGLLYVHSSRGGPFQRWHLDEVFLELKDGQQIPVFK 
LSGENGDEVKKE 


6986


642 


1350 


YHLYFKMGDPNSRKKQALNRl^QLRKKKESLADQFDFKMYIAF 
VFKEKKX KSAL F E VSEVI P VMTNNYEEN I LKG VRDS S YS LE S S L 

ELLQKDWQLHAPRYQSMRRDVIGCTQBMDFILWPRNDIEKIVC 
LLFSRWKESDEPFRPVQAKFEFHHGDYEKQFLHVLSRKDKTGIV 
VNNPNQSVFLF I DRQHLQTPKNKATI FKLCS ICLYLPQEQLTHW 
AVGTIBDHIiRPYMPE 


6987 


1623 


341 


LEAAEKASRAFKESQRQTDS KNYETEN WSPUKSQRR YDM Ytf TAc 
FLG E 1 E VGLYT I Q I LQLTPF FHKENBLS KKHMVQFLSGKWT I P p 
DPRNECYIALSKFTSHLKNLQSDLKRCFDFFIDYMVLLKMRYTQ 

¥TF?T APT MT ..Q V WQ PHITP Wtc»t nnr»T t ncyrmfrw *> mp««m*_ 
4V " A * VDX * »aow\voiw.ckm i tit Lit ^rtLiJUr ^ijijvoKiiSQIjLiQBENC 

RKKLEALRADRFAGLLE YLNPNYKDATTMBS I VNE YAFLLQQNS 
XKPMTNE KQNS ILAN I ILSCLKPtJSKLI Q PLTTLKKQIiRE VLQ F 
VGLSHQYPGPYFIiACLLFWPENQEUDQDSKLIEKYVSSLKRSFR 
GQYKRMCRSKQASTLFYLGKRKGLNSIVHKAKIEQYFDKAQNTN 
S L WHSGDVW KKNE V KDLLRi? L TGQAEGJCL IS VE YGTE BKIKXPV 
ISVYSGPLRSGRNI ERVSFYLGFS IEGPPGL 


6988 


3 


689 


TQLLRRPAVFVGSAASGI RSGLWS ASSGHWCAP AAGRAHAP VPR 
LVRGLGAASTAAPQDAQTGPQPMPRADCIMRHLPYFCRGQWRG 
FGRGS KQLG I PTAN F PEQWDNIiPADl STGI YYGWAS VGSGDVH 
KMWS 1 GWNP YYKNTKKSMBTH I MHTFKEDF YG EILNVAI VG YL 








RPSKNFDSLESLISAIQGDIEBAKKRLBLPEHLKIKEDNFFQVS 
KSKIMNGH 


6989 


2 


1118 


IMP SDRPLS PSTHASAGSHCHAPPTTARRAFP I PFGSKSNMATL 
KDQL I YNLLKEEQT PQNKITWG VGAVGMACAIS I LMKDLADE L 
ALVDVI EDKLKG EMMDLQHGS LFLRTPK I VS G KDYNVTANSKLV 
IITAGARQQEGBSRLNLVQRNVNIFKPIIPNVVKySPNCKLIiIV 
SNPVDILTYVAWKISGPPKNRVIGSGCNLDSARFRYLMGERLGV 
HPLSCHG WVLGEHGDSS VPVWSGMNVAGVS L KTLHPDLGTDKD K 
EQWKEVHKQWESAYEVIKLKGYTSWAIGLSVADLAESIMKNLR 
R VHP VS TM I KGIi YGI KDDVFLSVPCILGQNGISDLVKVTLTSEE 
EARLKKSADTLWGIQKELQF 


6990


719 


258 


THASGMASWLALRTRTAVTSLLSPTPATALAVRYASKKSGGSS 
KNIX3GKS SGRRQG I KKMEGHYVHAGNI I ATQRHFRWHPGAHVG V 

GKNKCLYALEEGIVRYTKEVYVPHPRNTEAVDLITRLPKGAVLY 
KTFVHWPAKPEGTPKLVAML 


6991 


169 


451


RRSSDFHNPGFLSRPVSLRENIHHQVlCSTKNkRRl^PKiCIA'fi'LL 
S S Li LMTNLN PNESTENQP VDAYWAFTLDQE FLTYACVEGTGCLF 
CGRHVH 


6992 


944


510 


RQAPGCS S LALRQVRQVYCGLVRAPQVQTRPLSSRFVERRGALY ' 
RSPK3NQENPPPYPGPGPTAPYPPYPP0PMGPGPMGGPYPPPQGY 
PYQGYPQYGWQGGPQEPPKTTVYWEDQRRDELGPSTCLTACWT 
ALCCCCLWDMLT 


6993 


1 


374 


QWCVTCPQHNARQGPAVPPdlQAYGAAP^KDLQVDFTEMSKCRG 
DRVWIKNWNVASLCPLWKGPQTWLSPPTAVXVEGI PAWIHHSH 
VKPAARETWEARPS PDNP FR VTLKKTTSPAP VTPGS 


6994


346 


1100 


QW P EKD P VMAASS IS S PHGKHVF KAI IiMVLVAL I LLHSALAQSR 
RDFAPPGQQKREAP VDVLTQI GRS VRGTLDAWI GPETMHLVSES 
S S QVLWAX SS AI SVAF FALSG IAAQLLNALGLAGDYLAQGLKLS 
PGQVQTFLLWGAGALVVYWLIiSLLLGLVLALLGRILWGLKLVI F 
IiAGFVALMRSVPDPSTRALIiLLALLILYALLSRLTGSRASGAQL 
EAKVRGLERQ VE ELRWRQRRAAKGARS VEEE 


6995 


144 


1346 


GS VA VGLSG 1 MAAQ KD LWDAI VIGAG IQGCFTAYHLAKHRKRIL 
LLEQFFLPHSRGS SHGQSR IIRJKA YLBDF YTRMMHECYQI WAQI> 
EHEAGTQLHRQTGLLLLGMKENQELKTIQANLSRQRVEHQCLSS 
EELKQRFPNIRLPRGEVGCLDNSGGVIYAYKALRAIiQDAIRQLG 
GI VRDGEKWEINPGLI>VTVKTTSRS YOAKSLVI T7AGP WTNQLL 
RPLGIEMPLQTLRIWVCYWREMVPGSYGVSQAFPCFLWLGLCPH 
HIYGLPTGEYPGLMKVSYHHGNHADPEERDCPTARTDIGDVQIL 
SSFVRDHLPDLKPEPAVIESCMYTNTPDEQFIItDRHPKYDNIVl 

GAGFSGHGPKLAPWGK1LYELSMKLTP3YDLAPFRISRFPSLG 
KAHL 






1942 


ETANAEAAARKSAMDWKEVLRRRLATPNTCPNKKKSEQELKDEE 
MDLFTKYYSE WKGGRKNTWE FYKTI P RFYYRLPAENEVLLQKLR 

BESRAVFLQRKSRELLDNEELQNLWFLLDKHQTPPMIGEEAMIN 
YENFLKVGE KAG AKCKQ F FTAKVF AKLI ,HTD Yf3R T e tmopctjv 
VNRKVWLHQTR IGLSLYLVAGQG YLRESDLENY ILELIPTLPQL 
DGLEKSFYSFYVCTAVRKFFFFLDPLRTGKIKIQDILACSFLDD 
LLELRDEELS KESQETNWFS A PSALR VYGQ YLNLDKDHNGMLSK 
EELS RYGTATMTIJVFLDRVFQECLT YDGEMD YKTYLDFVLALEN 
RKEPAAI^YIFIOjLDIEWKGYLNVFSLWyFFRAIQELMKlHGQD 
PVSFQDVKDEI FDMVKPKD PLKISLQDLINSNQGDT VTTIL IDL 
NGFWTYENREALVANDSENSADLDDT 


6997 


370 


1104 


AMELTIFILRLAIYILTFPLYLDNFLGLWSWICKKWFPYFLVRF 

TVIYNEQMASKKRELFSNLQEFAGPSGKLSLLBVGCGTGANFKF 

YPPGCRVTCIDPNPNFBKFLIKSIAENRHLQFERFVVAAGENMH 
QVAEGSVDVWCTLVLC3VKNQERIL.REVCRVLRPGGAFYFMBH 
VAAECSTWNYFWQQVLDPANHLLFDGCNLTRESWXALERASFSK 
LKLQHIQAPLS WELVRPH I YGYAVK 


6998 


2 


616 


F VSRALLRVR SRRHPAE E RAAPGR PEDAP IECPGATNCP EPLWC 
SHLP VPYAP PTME S RGKS AS S PKPDTKVPQVTTEAKVPPAADGK 
APLTKPS KKEAPAEXQQP PAAPTTAPAKKTSAKADPALLNNHSN 
LKPAPTVPSSPDATPEPKGPGDGAEEDEAASGGPGGRGPWSCEN 
FNPLLVAGGVAVAAIALILGVAFLVRKK 


6999 


14 


X J J X 


vjKHUftUoKxCLJ X AMb iGlCoiUV IKJj_MU I IjlUiNtoLHRAJjATLQE 

ElTVSLNTVDSIBSFVADINSGHWIJrvLQAIQSLKLPDKTLIDL 
YEQVVLELIELRBI^GAARSLLRQTDPMIMLKQTQPERYIHLENL 
LARSYFDPREAYPIXSSSKEKRRAAIAQALAGEVSVVPPSRLMAL 
LGQALKWQQHQGLLPPGMTIDLFRGKAAVKDVEEEKFPTQLSRH 
I K FGQ KSKVECARFSPDGQ YLVTGS VDGF IE VWNFTTG K IRKDIi 
KYQAQDNFMMM DDAVI>CM CFSRDTEMIATGAQDGKI KVWK IQSG 
QUiKKF SKAno WjVTv-lio r StUJSSQlljSASrDQTIRIHGLKSGK 
TLKEFRGHSSFVNEATFTQDGHY I IS ASSDGTVKI WNMKTTECS 
NTFKSLGSTAGTDITVNSVILLPKNPEHFWCNRSJTTVVIMNMQ 
GQIVRSFSSGKREGGDFVCCALSPRGEWIYCVGEDFVLYCFSTV 
TGKLERTLTVHEKDVIGIAHHPHQNLIATYSEDGLLKLWKP 


7000 


2 


827 


GPGWFLELMESEGPPESERSEFFSQREEENBEEEAQEPEETGP 
KNPLLQPALTGDVEGIiQKlFEDPENPHHEQAMQLLIiEEDIVGRN 
LLYAACMAGQSDVI RALAKYG VNLNE KTTRGYTLLHCAAAWGRL 
ETLKALVELDVD I EALNFREERARDVAAR YSQTECVE FLDWADA 
RLTLKKY I AKVS LAVTDTEKGSGKLLKEDKNTI L S ACRAXNEWL 
a Xtfi £>/u LriZLic r. yKv^Ji-JVU ivlrlc x Wl 1 1 r V.Q VKS AKS VTSH 
DQKRSQDDTSN 


7001 


2056 


844 


RRCLI IAFLKGCFIFIYFIFI FETEFLSCCPGWSAVAQSRLI AN 
PASQVQAIFILPKDSQVGPDVKSEAAPKRALYESVFGSGE I CGP 
TSPKRLCIRPSEPVDAWWSVKHDPLPLLPEANGHRSTWSPTI 

AP VHI DVGGHM YTSS LATLTK Y PES R I GRLFDGTEP I VLDS LKQ 
HYFIDRDGQMFRYIIOTLRTSKLLIPDDFKDYTLLYEEAKYFQL 
QPMLLEMERWKQDRETGRFSRPCECLWRVAPDLGERITLSGDK 
SLIEEVFPEIGDVMC^SVNAGWNHDSTHVIRFPLKraYCHIiNSVQ 
VLERLQQRGFEIVGSCGGGVDSSQFSEYVIiRRELRRTPRVPSVI 
RIKQEPLD 


7002 


1043 


498 


PMPSSTRWTTS*TYTDTSSAWACRPTTGTCr*TAAPGPTVRWWP 
TPCSRHQSRRRLTCWCSTSRPCGR*GGLCVRTAPTRPTTSASSS 
SWTSAGrSWPAGRRTGTATSGTATTTSVWPGCGTRMWSTQWSSV 
PRSRS CCSRPATTP PSKPGAPHAPCASSRHLAHGLAPSSPGLPA 
RGAEVC 


7003 


818 


61 


QGRFRAF CWQRDFLQP PGMRLS ALLAIiAS KVTLPPH YR YGMS P P 
GSVADKRKNPPWIRRRPVWE PI SDEDWYLFCX3DTVE ILEGKDA 
GKQGKWQVI RQRNWWVGGLNTHYR YIGKTMDYRGTMI PSEAP 
LLHRQVKLVDPMDRKPTEIEWRFTEAGERVRVSTRSGRI I PKPE 
FPRADG IVPBTWIDGPKDTSVEDALERTYVPCLKTLQEEVMEAM 
GI KBTR \NTRRS I G I E PGAEQLLPNFCPS LEG 


7004 


121 


2285 


FLLPVLTSRSLRQPAVPHARLGGVEPAAMKSARAKTPRKPTVKK 
G\ PKRHiKTQLG/ Y YCRVR PLGFPDQECC I EVINNTTVQLHTPE 
GYRLNRNGD YKETQ YS FKQVFGTHTTQKELFD WAN PLVNDL IH 
GKNGLLFTYGVTGSGKTHTMTGS PGEGGLLPRCLDM I FNS IGSF 
QAKRYVFKSNDRNSMDIQCEVDALLERQKREAMPNPKTS S SKRQ 
VDPEFADMITVQEFCKAEEVDEDSVYGVFVSYIEIYNNYIYDLI* 
EE VPFDP I NPNLHNLNCFVKI KNHNM YVAGCTEVE VKSTEEAFE 
VFWRGQKKRRI ANTHLNRESSRSHSVFNI KLVQAPLDADGDNVL 
QEKEQITISQLSLVDLAGSERTNRTRAEGNRLREAGNINQSLMT 

LR T CM D VL RENQMYGTN KMVP YRDS KLTHL PKN YFDG EG KVRM I 

VCVNPKAEDYEKNLQVMRFAEVTQEVEVARPVDKAICGLTPGRR 

YRNQPRGP\ IGNEPLVTDWLQSFPPLPSCE ILDINDEQTLPRL 

IBALEKRHNLRQMMIDEFNKQSNAFKALLQEFDNAVLSKENHMQ 

GKLNEKEKMISGQKLBIERLEKKNKTLEYKIEILEKTTTIYEED 

KRNLQQELETQNQKLQRQFSDKRRLBARLQGMVTETTMKWBKEC 

ERRVAAKQLEMQNKLWVKDEKLKQLKAIVTEPKTBKPERPSRER 

DREKVTQRSVSPSPVPVSYL 


7005 


63 


87* 


RNMALYQRWRCLRLQGLQACRLHTAWSTPPRWLAERXjGtPEEli 
WAAQ VKKLASMAQKE PRTI K I SLPGGQKI DAVAWNTTP YQLARQ 
I S STLADTAVAAQVNGEPYDLERPLETDSPIiRFLTFDS PEGKAV 
PWHSSTHVLGAAABQFLGAVLCRGPSTEYGFYHDPPLGKERTIR 
GSELPVLERICQELTAAARPFRRLEA3RDQLRQLPKDNPFKLHL 
IEEKVTGPTATVYGCGTLVDLCQGPHLRHTGQIGGIiKLLSNSSS 
LWRSSG 


7006 


22 


898 


NAFGRHSTAVKMAAAAWLQVLPVI LIJjIiG AHPS PL>S PFS AGPAT 
VAAADRS KWH 1 PI PSGKNYFSFGKILFRNTTI FLKFDGEPCDLS 
LNITWYLKSADCYNEIYNFXAEEVELYLEKLKEKRGLSGKYQTS 
S KLPQNCS ELPKTQTFSGDFMHRLPIiLGB KQEAKENGTNLTFIG 
DKTAMHEPLQTWQDAPYTFIVHIGISSSKESS KENSLSNLF1WT 
VEVKGPYB YLTLEDYPLMIF FMVMCI VWLFGVLWLAWSACYWR 
DLLRIQFW IGAVI FLGMLEKAVPYAGFQ 


7007 


2 


1001 


AMTVSGPGTP EPR PATPGAS S VEQLRKEGNE LFKCGD YGGALAA 
YTQALGLDATPQD QAVLHRNRAACHLKLED YDKAETEAS KAIE K 
DGGDVKAL YRRSQALEKLGRLDQAVLDLQRCVSLEP KNKVFQEA 
LRNIGGQ I QEKVRYMSSTDAKVEQMFQIItLDPEEKGTEKKQKAS 
QNLWLAREPAGAEKI FRSNGVQL.LQRLLDMGETDLMLAALRTL 
VG I CSEHQ SRTVATLS I LGTRRWS ILGVES QAVSLAACHLLQV 
MFDALKEG VKKX3PRGKEGAI I VGE^Q VWGLLDVTVWEGMGI*S Q 
PGQFFGDQTCSCRLFGIRFGDI I JUL 


7008 


70 


1478 


CRSALCHERPPPAHLPAGGRRLQTCPRSCRWLGRPPSGLPPGPR 
S P P PLAG PGQKM VQKKPAELQG FHRS FKGQNPFELAFSLDQ PDH 
GDSDFGLQCSARPDMPASQPID I PDAKKRGKKKKRGRATDS FSG 
RFEDVYQLQEDVLGEGAHARVQTCINLITSQEYAVKIIEKQPGH 
I RSRVFREVEMLYQCQGHRNVLELI E F FEE EDRFYLVFE KMRGG 
S rLSHIHKRRHFNELEASVWQDVASALDFLHNKGIAHRDLKPE 
NIIiCEHPNQVSPVKICDFDLGSGIKliNGDCSPISTPELLTPCGS 
AEYMAPEWEAFSEEASIYDKRCDLW5LGVrLYILLSGYPPPVG 
RCGSDCGWDRGEACPACQNMLFES IQEGKYEFPDKDWAHI SCAA 
KDLISKLLVRDAKQRLSAAQVLQHPWVQGCAPENTLPTPMVLQR 
WDSHPLLP PHPCRIHVRPGGLVRTVTVNB 


7009 


1 


626 


ARQLRNSW VDDFVAAPLI PLSQQI PTGNSLYES YYKQVDPAYTG 
RVGASEAALFLKKSGLSDIILGKIWD1ADPEGKGFLDKQGFYVA 
LRLVACAQSGHEVTLSNLNLSMP P P KFHDTSS PLMVTP PSAEAH 
WAVRVEE KAKPDG I FESLLP INGLLSGDKVKP VLMNS KL PLDVL 
GRVWDLSDIDKDGHLVRDEFAVAMHLVYRALE 


7010 


79 


571 


SHTRRAWPETLLS PLCPLLGGGTAMSGGEQKPERYYVGVDVGT 
GSVRAALVDQSGVLLAFADQPIKNWEPQFNHHEQSSEDINAACC 
VVTKKVVQGrDLNQIRGLGFDATCSLVVLDKQFHPLPVNQEGDS 
HRNVIMWLDHRAVSQVMRINETKHSVLQYVGG 


7011 


3 


994 


RIQTLPNQNQSQTQPUiKTPPAVLQPIAPQTTFGVQTQPQPQSL " 
LQAQISAASITPLLQTQPQPLLQQPQQKAGLLQPPVRIVSQPQP 
ARRLDPPSRFSGRNDRGDQVPNRKDDRSRERERERRRSRERSPQ 
RKRSRERSPRRBRERSPRRVRRVVPRYTVQPSKFSLDCPSCDMM 
BLRRR YQNLYI PSDFFD AQFXWVDAPPLSRP FQLGNYCNFYVMH 
REVESLEK^MAILDPPDADHLYSAKVMLWASPSMBDLYHKSCAL 
AEDPQ E LR DG FQHPAR L VKFL VGMKGKDEAMA I GGHWS PS LDGP 
DPEKDPSVLIKT\AIRCCKALTG 


7012 


1 


_1 


RRAGSVKRG^LFGPTERQSERPLRPSAARRPBMLSGKKAAAA 

AAAAAAAATGTEAGPGTAGGSENGSBVAAQPAGLSGPABVGPGA 

VGERTPRKKEPPRASPPGGLABPPGSAGPQAGPTWPGSATPME 

TGIAETPBG\RRTSRRK*AKVEYREMDESLANLSEDEYYSEEER 

NAKAEKEKKLP PP P PQAPPEEENE S BPEB PSG VEGAAFQSRLPH 

DRMTSQEAACFPDIISGPQQTQKVFLFIRNRTLQLWLDNPKIQL 

TFEATLQQLEAPYNSDTVLVHRVHSYLERHGLINFGIYKRIKPL 

PTKXTGKV I HGSGVSGLAAAROLQSFGMDVTLLEARDRVGGRV 

ATFRKGNYVADLGAM WTGLGGN P MAWS KQVNMELAKI KQKCP 

LYEANGQAVPKEKDEMVEQEFNRLLEATSYLSHQLDFNVLNNKP 

VSLGQALE WTQLQE KH VKDEQ I EHWKKXVKTQEELKELLNKNV 

NLKEKI KELHQQYKEAS EVKP PRDITAEFLVKSKHRDLTALCKE 

YDELAETQGKIjEEKLQELEANPPSDVYLSSRDRQItiDWHFAKLE 

FANATPLSTLS LKHWDQDDDFB FTGSHLTVRNGYSCVPVALAEG 

LDIKLNTAVRQVRYTASGCEVIAVNTRSTSQTFXYKCDAVLCTL 

PLGVLKOX5PPAVQFVPPLPEWKTSAVQRMGFGNLNKVVLCFDRV 

F WDPS VNLFGHVGSTTASRGEL FLFWNLYKAPI LLALVAGEAAG 

IMEN1SDDVI VGRCLAILKGI FG3 SAVPQPKETWSRWRADPWA 

RGSYSYVAAG3SGNDYDLMAQPITPGPSIPGAPQPIPRLFFAGE 

HTIRNYPATVHGALLSGLREAGRIADQFLGAMYTLPRQAT^GVP 

AQQSPSM 


7013 


1 


2661 


RRAGSVKRGEARLFGP?ERQSERPIJ*PSAARRPEMLSGKKAAAA 
AAAAAAAATGTEAGPGTAGGSENGSEVAAQPAGLSGPAEVGPGA 
VGERTPRKKEPPRASPPGGLAEPPGSAGPQAGPTWPOSATPME 
TG I AETPEG \RRTSRRKRAKVEYREMDESLANLS EDEYYS EEER 
NAKAEKEKKLPPPPPQAPPEEENESEPEEPSGVEGAAFQSRLPH 
DRMTSQEAACFPDIISGPQQTQKVFLFIRNRTLQLWLDNPKIQL 
TFEATLQQLEAPYNSDT VLVHRVHS YLERHGL I NFG1 YKRI K PL 
PTKKTGKVI I IGSGVSGIAAARQLQSFGMDVTLLEARDRVGGRV 
AXFRKGNYVADLGAMWTGLGGNPMAWSKQUKMELAKI KQKCP 
LYEANGQAVPKEKDEMVEOEFNRLLEATSYLSHQLDFNVLNNKP 
VS LGQALEWI QLQEKHVKDEQIEH WKKIVKTQEEL KELLNKM V 
NLKEKI KELHQQYKEAS EVKP PRDITAEFLVKSKHRDLTALCKE 
YDELAETQGKLEEKLQELEANPPSDVYLSSRDRQILDWHFANLE 
FANATPLSTLS LKHWDQDDDFEFTGSHLTVRNG YS C VPVALAEG 
LDI KLNTAVRQVRYTASGCE VI AVNTRSTSQTFI YKCDAVLCTL 
PLG VLKQQPPAVQ F VP PLPE WKTS AVQRMGFGNLNKWLCFDRV 
FWDPS VNLFGHVGSTTASRGELFLFWNLYKAPI LLALVAGEAAG 
IMENI SDDVIVGRCLAILKGI FGSSAVPQPKETWSRWRADPWA 
RGS YS YVAAGSS GNDYDLMAQPI TPGPS I P GAP QP I PRLFFAGE 

HTIRNYPATVHGALLSGLREAGRLADQFLGAMYTLPRQATPGVP 
AQQSPSM 


7014 


3 


3950 


iH? ts ViiUKXKILATLEDGWLEGS LKGRTGI FPYRFVKLCPDTRVE 
ETMALPQEGSLARIPETSLDCLENTLGVEEQRHETSDHEAEEPD 
CI I SEAPTSPLGHLTS E YDTDRNS YQDBDTAGGPPRS P GVEWEM 
PLATDS PTSDPTE WNGISS QPQ VPFHPNLQ KSQ YYSTVGGSHP 
HSEQY PDLLPLEARTRDYAS LP PKRM YS QLKTLQKP VLPL YRGS 
SVSASRWKPRQSS PQLHNLAS YTKKHHTSS VYS ISERLEMKPG 
PQAQGLVMEAATHSQGDGSTDLDSKLTQQLI EFE KS LAGPGTEP 
DKILRHFSIMDFNS BKDIVRGSSKLITEQELPERRKALRP PPP2 
PCTPVSTSPHLLVDQNUCPAPPLVVRPSRPAPLPPSAQQRTNAV 
SPKLLSRHRPTCETLEKEGPGHMGRSLDQTS PCPLVLVRIEJEME 
RDLDMYSRAQEELNLMLEBKQDESSRAETLEDLKFCESNIE5LK ' 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








MELQQLREMTLLSSQSSSLVAPSGSVSAENPEQRMuEKRAKVIE 
ELLQT ERD Y I RDLEMC IEIRIM VPMQQAQVPN I DFEGLFGNMQMV 
I KYS KQLLAALEISDAVGPVFLGHRDEIiBGTYKI YCQNHDBAIA 
LLE I YEKDEXI QKH LQ DS LADLKS LYNEWGCTNYINLGS FLI KP 
VQRVMRYPLLLMELLNSTPESHPDKVPLTNAVIAVKEINVNINE 
YKRRKDLVLKYRKGDEDSLMEKISKLN HIS I IKKSNRVSSHLKH 
LTGPAPQIKDEVFBETEKNFRMQERLIKSPIRDLSLYLQHIRES 

TERLVISPLNQIiLSMFTGPHKLVQKRFDKLLDFYNCTERAEKLK 
DKKTLEELQSARNNYBALNAQLLDEIiPKFHQYAQGLFTNCVHGY 
AEAHCDFVKQALEQLKPLLSLLKVAGREGNLIAIFHEEHSRVLQ 
QLQVFTFFPESLPATKKPPERKTIDRQSARKPLLGLPSYMLQSE 
ELRASLLARYPPE KLFQAERNFNAAQDLDVSLLEGDLVGVIKKK 
DPNGSQNRWLI DNGVTKGFVYSSFLKPYN PRRSHSDAS VG5HSS 
TESEHGSSS PRFPRQNSGSTLTFNPN\ S \MAVS FTSGSCQKQPQ 
DASPPPKEWDQGTLSASLNPSNSESSPSRCPSDPDSTSQPRSGD 
SADVARDVKQPTATPRSYRNFRHPBIVGYSVPGRNGQSQDLVKG 
CARTAQAPEDRSTEPDGSEAEGNQVYFAVYTFJCARNPNEL3VSA 


7015 


1842 


513 


RQAWHE \ VAAP S WRG ARLVQS VLRVWQVG PHVARB RV1 P FSSLI* 
GFQRRCVSCVAGSAFSGPRLASASRSNGQGSALDHFLGFSOPDS 
SVTPCVPAVSMNRDEQDVLLViiHPDMPENSRVLRVVljLGAPNAG 
KSTLSNQLLGRKVFPVSRKVHTTRCQALGVITEiffiTOVILIiDTP 
GI I S PGKQKRHHLELSLLEDPWKS MES ADLVWLVDVS D KWTRN 
QLSPQLLRCLTKYSQIPSVXVmKVDCLKQKSVLLEIiTAALTEG 
WNGKKLKMRQAFHSHPGTHCPSPAVKDPNTQSVGNPQRIGWPH 
FKB1FMLSALSQEDVKTLKQYLLTQAQPGPWEYHSAVLTSQTPE 
EI CANIIREKLLEHLPQEVPYNVQQKTAVWEEGPGGELVIQQKL 
LVPKESYVKLLIGPKGHVISQIAQEAGHDLMDIFLCDVDIRLSV 
KLLK 


7016 


167 


2513 


IliNAPKPPPPRDSVEAVAAKRDTGGGSWGTGMDVSGQETDWRST 
AFRQKLVSQI EDAMRKAG VAHS KS S KDMES HVFLKAKTRDEYT* S 
LVARLI IHFRD IHNKKSQASVSDPMNALQSLTGGPAAGAAGIGM 
PPRGPGQSLGGMGSLGAMGQPMSLSGQPPPGTSGMAPHSMAWS 
TATPQTQLQLQQVAAAAAAAXARSSSS SS RRRYS SSSSSSNSKQ 
FOAQQSAMQQ\QFQA\WQCX3QQL\QQQQQC2QQHLIKXHHQNQQ 
QI QQOOOOLOR I AOLOLGGOOOOGOGOOOOOOOALOAOPP IOOP 
PMQQPQ P P PSQALPQQ U3QMHHTQHHQP PPQPQQP P VAQNQP SQ 
LPPQSQTQPL VS OAQAXiPGQML YTQPPLKFVRAPMWQQPPVQ P 
QVQQQQTAVQTAQAAQMVAPGVQVSQS S LPMLSSPSPGQQVQTP 
QSMPPPPQPSPQPGQPSSQPNSNVSSGPAPSPSSFLPSPSPQPF 
\ QSPVTARTPQNFS VPS PGPLNTP VNPS S VMS PAGS SQAEEQQ Y 
LDKLKQLSKYIEPLRRMINKIDKNEDRKKDLSKWKSLLDILTDP 
SKRCPLKTLQKCEIALBKLKNDMAVPTPPPPPVPPTKQQYLCQP 
LLDAVLAN I RS P VFNHSLYRTFVPAMTA IHGPPI TAP WCTRKR 
RLEDDERQSIPSVLQGEVARLDPKFLVNLDPSHCSNNGTVHLIC 
KLDDKDLPSVPPLELSVPADYPAQSPLWIDRQWQYDANPFLQSV 
HRCMTSRI^IiQLPDKHS VTALLNTWAQS VHQACLSAA 


7017 


1 


1785 


INLGNTCYMNSVI*ALFMATDFRRQVLSLNI*NGCNSI*MKKLOHL 
PAFLAHTQRBA YAPR I FFEAS R P P WFT PRS QQDCSE YLRFLLDR 
LHEEEKIliKVQASHKPSEILECSETSLQEVASKAAVIiTETPRTS 
DGEKTL IE KMFGG KLRTHI RCLNCRSTS QKAEAFTDLS LAFWPS 
YSLEYMSCPDCSQSPSIQDGGLMQASVPGPSBEPWYNPTTAAF 
ICDSLVNEKTIGSPPKEFYCSENTSVPNESNKILVNKDVPQKPG 
GETTPSVTDLLNYFLAPEILTGDNQYYCENCASLQNABKTMQIT 
EEPE YLILTLXjR fs ydqkyhvrrkildnvslpl vlel p vkr its 
FSSl^ESWSVDVDFTDLSENLAXKLKPSGTDEASCTKLVPYi7LS~' 
SWVHSGISSESGHYYSYARNITSTDSSYQMyHQSEALALASSQ 
SHLLGRDS PSAVFEQDLENKEMS KEWFLFNDS RVTFTS PQSVQK 
ITSRFPKDTAYVLLYKKQHSTNGLSGNNPTSGLWINGDPPLQKE 
W4DAITKDNKL YLQEQ ELN ARARALQAASAS CS FRPNGFDDND P 
PGS CGPTGGGGGGGFNT VGRLVF 


7018 


484 


1066 


SLVFRGNTWSGEAGHHCSALFfJLAAYHQLFVGTERIRAPEl 1 FQ~" 

PSLlGEEQAGLAETLQYILDRYPKDVQEMLVQJIVFLTGGNTMYp 

GMKARMEKELLEMRPFRSSFQVQLASNPVLDAWYGARDWALNHL 

DDITEVWITRKEYEEKGGEYLKEHGASNIYVPIRLPKQASRSSDA 
QASSXGSAAGGGGAGEQA 


7019 


1048 


335 


APGGFLVTMVFPAPSPPWMLGCCSHBVTAGPPTLCKDMSALVAA 
RMRHI PIAPGSDWRDLPNIEVRLS DGTMARKLRYTHHDRKNGRS 
S SGALRG VCS CVEAGKACDPAARQ FNTL I PWCLPHTGNRHNHWA 
GIiYGRLEWDGFFSTTVTNPEPMGKQGRVLHPEQHRWSVRECAR 

SQGFPDTYRLFGNILDKHRQVGNAVPPPLAKAIGLEIKLCMLAK 
ARESASAK I KEEEAAKD 


7020 


1 


21*4 


FADSKRKSVLLDKIKNLQVALTSKQQSLETAMSFVARNTFKRVR 

NGFLMRKVAVFFSNTPTRASPQLREAVLKLSDAGITPLFLTRQE 

DRQLINALQ INNTAVGHALVLPAGRDLTDFLENVLTCHVCLDI C 

NIDPSCGFGSWRPSFRDRRAAGSDVDIDMAFILDSAETTTLFQF 

NEMKKYIAYLVRQLDMSPDPKASQHFARVAWQHAPSESVDNAS 

MPPVKVEFSLTDYGSKEKIiVDFLSRGMTQLQGTRAl»GSAIEYTT 

ENVFESAPNP RDLKI WLMLTGE VPEQQLEEAQRV1 LQAKCKGY 

FFWLGIGRKVNIKEVYTFASBPNDVFFKLVDKSTELNEEPLMR 

FGRLLPSFVS SENAFYLS PDIRKQCDWFQGDQPTKNLVKFGHKQ 

VNVPNNVTSSPTSNPVTTTKPVTTTK^VTTTTKPVTTTTKPVTI 

INQPSVKPAAAKPAPAKPVAAKPVATKTATVRPPVAVKPATAAK 

PVAAKPAAVR P PAAAAAKP VATKPEVPRPQAAKPAATKPATTKP 

MVKMSREVQVFEITENSAKLHWERPEPPGPYFYDLTVTSAHDQS 

IiVLKQNLTVTDRVIGGLLAGQTYHVAWCYLRSQVRATYHGSFS 

TKKSQPPPPQPARSASSSTINLMVSTEPLALTETDICXLPKDEG 

TCRDF I LKWYYDPNTKS CARFWYGGCGGNENKFGS QKECEKVCA 

PVLAKPGVISVMGT 


7021 


2 


338 


VNAVS FFPNG YAF ATGSDDATCR LFDLRADQELLLYSHDN 1 1 CG 
ITSVAFSKSGRLLLAGYDDFNCNVWDTIiKGDRAGVLAGHDNRVS 
CLGVTDDGMAVATGSWDS PLRIWN 


7022 


2 


856 


vyigsfwshpllipdnrklfeaeeqdlfrdiqslprnaaLrkln 

DLIKRARIAK\raAYlISSLKKEMPSVFGKDNKKKEnjVNNLAEIY 

grierehqispgdfpnlkrmqdqlqaqdfskfqplkskllewd 
dmlahdiaqlmvlvrqebsqrpiqmvkggafbgtlhgpfghgyg 
egagegiddaewvvardkpmydeifytlspvdgkitganakkbm 
vrsklpnsvi^kiwkladidkdgmldddefalanhlikvklegh 
elpnelpahllppskrkvae 


7023 


2 


748 


amvfggwpyvpqyrdirrtqWadgfstyvclvijjvanilrilf 

«r vjw r iw r UL» wy bAi M j i» i m LLML KLCTEVRvANELNARRRSF 
TAADS XDEEVKVAPRRS FLD FDPHH FWQWSS FSDYVQCVIiAFTG 
VAGYI TYLSIDSALFVETLGFLAVLTEAMLGVPOLYRNHRHQST 
EGMSIKMVLMWTSGDAFOAYFLLKGAPLQFSVCGLLQVLVDLA 
ILGQAYAFARHPQKPAPHAVHPTGTKAL 


7024 


1207 


190 


R3X3VTGWAgWMFGGGGVLSSGEQI^MPVKPERGI^PSbGWLV 
SSRRGSPGTVLGLPFWLLTPVLVSRSIRSMLLLTRSPTAWHRLS 
QI^PVLPGTLGGQALHLRSWLLSRQGPAETGGQGQPQGPGLRT 
RliLITGLFGAGLGGAWLALRAEKERLQQQKRTEALRQAAVGQGD 
FHI^DHRGRAilCKADFRGQWVLM YFGFTHCPD I C PDELEKLVQV 
VR0LEAEPGLPPVQPVFITVDP3RDDVEAMARYVQDFHPRLLGL 
TGSTKQVAQASHS YRVYYNAGP KDEDQDY I VDHS I AIYLLNPDG 
LFTD YYGRS RS AEQ I SDS VRRHMAAFRS VLS 


7025 
7026 


232 


832 


ernspigwnenl*k\hsldclcfrgdwegntqp^tlqdnqbecf 
kqvirtcekrptfnqhtvfnlhqrlntgdklnefkelgkap1sg 
sdhtqhqlihtsekfcgdkecgntflpdseviqyqtvhtvkkty 
eckecgksfslrssltghkrihtgekpfkckdcgkafrfhsqls 
vhkrihtgeksyeckecgkafscg 




328 


1146 


~N PN PS IGD I KB I KKAAKSMLD PAriKSHFH P VTPSLVFLCFI FDG 
LHQAULSVGVSKRSNTWGNENEERGTPYASRFKDMPNFIALBK 
SS VIiRHCCDLLIGVAAGS SDKI CTSSLQ VQRRFKAMMAS IGRLS 
HGESADLLISCNAESAIGWISSRPWVGBLMFTFLFGDFESPLHK 
LRXSS * LPRKHR*QPINAVRMFLDQCMDGS IALRAI VSEIP VPE 
BKKNNG*KGIGEIF*VWGCTPPHYWGAVTTNVPKLSNSGKLLG 
QDEQPHIFG 


7027 


43 


954 


GRRbQQQQRPEDAEDGAEGGGXRGEAGWEGGYPEIVKENKiFEH 
YYQELKIVPEGEWGQFMDALREPLPATLRITGVKSHAKEILHCL 
KNKYFK£LEDLEMDGOKVEVPQPLSWYPEEI»AWHTNLSRjaLRK 
SPHLEKFHQFLVSETESGITISRQEAVSMIPPLLLNVRPHHKILD 
MCAAPGSKTTQLIEMLH^DMNVPFPEGFVIANDVDNKRCYLLVH 
QAKPJjSSPCIMVVNHDASS IPRLQIDVDGRKE tliFYDRlLCDVP 
CSGDGTMRKN I DVWKKWTT LNSIiQIiHGIjQLRIATRGAEQL 


7028 


189 


608 


SRP PPEPEPGTMVBKGSDS S SEKGGVPGTPSTQStiGSRNK I RNS 

KKMQS W YSMLSPTYKQRNE DFRKL FS KLPEAERLI VDYS CALQR 

EILLQGRLYLSENWICFYSNIFRWETTISIQLKEVTCLKKEKTA 
KLIPNAIQ 


7029 


1343 


40 


VLBSNTEAKQATGTSSKLRHGTGQEKGREGPRCPSGLAQLRLWG" 
/ PCPHAGRETGPRASAPI PGS *GHGm03*RKJDGRGERSEG PSAL 
SPHSPSLLNMQQAPTHVGPGMGSQRPRSSWPEQVGVGSQLSRE 
RWRA*RSLPGAAASERTEMTKERSP/RPCCGYDSSWWFTQPGKK 
TRKRNSRRNTMVSRGGGCLI*YPLQSIMPE*QLR*GAHASPPTQG 
R* GKGGPRS PLTKASGTTHI PTPFFGS I P/RPTRDSGPGTDNS \ 
AAPGQKRGHREA *QGPEPV/ WGRVTTHLQGPAG * TKPLGS \RNW 
VPGPAEGEQGEGAGLEGRP * PLKGCRSTLTFSPQLSI PMVGKKP 
PEGTTASFFP\RSCHSE*RKPPPSCPHAPALSLPHPLPLPLPPL 

PLPLPGAGT*HSARSGRPGQSETGSLCHNCHHCPPHCPKCSPGG 
T 


7030 


2 


521 


FVCFSAPGSGQ6GKRRVKMELSAVGERVFAAEALLKRRIRKGRM 
EYLVKWKGWSQKYS^WEPEENIIOARLIAAFEERERE^4EI,YGPK 
KRGPKPKTFLLKAOAKAKAKTYEFRSDSARGIRIPYPGRSPQDL 
ASTSRAREGLRN \ RVCPRQRAAPAPAAP \ PRRGPSG PGPRPG * G 
PGLHFPGPGGPSKHGFVPASEOHQHQQHLPRRGPSGPGPRPG 


7031 


960 


59 


HCSVPGAEWPRKPPAQICPQLTSRPHLSSPRSIiSPGCGHSPGPG 
/ CKPS /RHCDELHEGPSRTAALPCGKPQPKHGVEECG/PCPCLA 
PRRLTEPPALTVSPVGRAAPSGAI**PSGRACSACSHRLAPEAAL 

*?t\nf\tr Rf o uvsoMU^AdubrAAD U tr tr\iUi^ & \j crittTV PS PARS V P P 

LGAQARAAP PRLWC PRALVSG* EASPEAVSVAAGPPVPGPTPST 
SGSTASHSRRGC* S PR* TPAP PRRJDHGRS AAFE VLTAAASAQP C 
ASQGGPRPTGAGRTPSPLGLPFSRGPPAASARPFCRHPSL 


7032 


1393 


2104 


rrpgrtepvepppvpppprasnsksrcr*rnlhlapl*qspi7rk 
srqigtsslpfgrsagerprpaatfclsrggsspvfl*pssssl 
epwmkrqfgrlhsi.pwkswqkmnsflltpkldtslmsgwryrqr 
lprlhtflkkslqmaselapplptpaplasslppppgpppllpv 
pta*lsrsgilvppnsgfslsc\plgdh*gssgevrgscgsppp 
hhcwlpppp*lllppr 


7033 


689 


815 


RSRDCLSSSATSNRARRSKCSGPKRATPLDSGPGP*APPGPSSA' 
lmmpsscpwrtgaixspspagsralgrct'ssVgpgsrWltrtssp 
gcatrtwrtmrmbprplrsrmgesapgipaelpsaapsgpsaps 
aaapsapttpaaagpntl*srrtaewcwppscsccwgwc*swsa 
wdwrrpplqvs papssscras ccwcles it* s 3starsratgas 
ssstcptsrsdrgaawtp\spmgapllpcsvplisreealqdpr 
npsp*gvcsgssghaglalgkppvacsvp 


7034 


92 


1942 


EDTSSMPFRLLI PLGLLCALLPQHHGAPGPDGSAPDPAHYRSRV 
KAMF YHAYDS YLENAFP FDELRPLTCDGHDTWGS FSLTLI DALD 
TbL\TLFYFQILGNVSEFQRWEVLQDSVDPDlDVNASVFETNI 
RVVGGLLSAHLLSKKAGVEVEAGWPCSGPLLRhlAEEAARKLLPA 
FQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEF7ATLSSL 
TGDP VFED VARVALMRLWES RSD I G LVGNHIDVLTGKWVAQDAG 
IGAGVDSYFEYLVKGAILLQDKKI.MAMFLEVNKAIRNYTRFDDW 
YLWVQMYKGTVSMPVFQSLBAYWPGLQSL1GDIDNAMRTFLNYY 
TVWKQFGGLPEFYNIFQGYTVEKREGYPLRPELI BSAMYLYRAT 
GDPTLLELGRDAVES I E K I S KVECG FATI KDLRDHKLDNRM ES F 
FLAETVKYLYLLFDPTNFIHNNGSTPDAVITPYGECILGAGGYT 
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQ 
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPS 
QPFTSKLALLGQVFLDSS * PLDNFFI FIFLRLN YNKLLLAI IKK 
K 


7035 


92 


1942 


FJ3TSSMPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERV 
KAMFYHAYDSY1»ENAFPFDELRPLTCDGHDTWGSFSLTLIDAIjD 
TLI»\TLFYFQILGNVSEF0RWEVLQDSVDFDIDVNASVFETWI 
RWGGLLS AHLLS KKAGVBVEAGWP CS GPLLRMAEEAARKLLPA 
FQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEFATLSSL 
TCDPVFED VARVALMRLWESRS DI GLVGNH ID VLTGKWVAQDAG 
j.\mvj vua irtl iiV^toAXIjJjQDKKLriAMr LEYNKAIRNYTRFDDW- 
YLWVOMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFIiNYY 
TVWKQFGGLPEFYNIPQGYTVEKREGYPliRPBIiIESAMYLYRAT 

FLAETVKYLYLL FDPTNFIHNNGS TFDAVITPYGECI LGAGGYI 
FNTEAHPIDPAALHCCQRLKEEQWEVBDLMREPYSLKRSRSKFQ 
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPS 
QPFTSKLALLGQVFLDSS * PLDNFFI F I FLRLNYNKLLLAI I KK 
K 


7036 


442 


751 


CLAPLFSCFQI INLHLAPSGRLRWAWLRGPGRN* LPGEGPS I PT 
RNW*ERKAGCSQPC/PAQQHHGRPPGVSPLPRDPHPTTLRPLPP 
PPPPPPPPPRRPPRNRRPG 


7037 


442 


751 


CLAPLPS CFQI INLHLAPSGRLRWAWLRGPGRN* LPGEGPS I PT 
RNW* ERXAGCSQPC/ P AQQHHGRP PGVS PLPRDPHPTTLftPLP P 
PPPPPPPPPRRPPRNRRPG 


7038 


155 


891 


GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEE I IL 
QYNKLLEKSDLHSVLAQKLQAEKHDVPNRHEISPGHDGTWNDNQ 
LQEMAQLRI KHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM 
QMNEAKI AECLQTI SDLETECLDLRTKLCDLERANQTLKDEYDA 
LQI TFTALEGKLRKTTEEWQELVTR WMAEKAQEANRLNARE *KR 
LQEAAS PAAERACRS SKGTSTSRTG 


7039 


155 


891 


GAGAASDMS SGLRAAD FPRWKRH I S EQLR RRDRLQRQ A FEE I IE 
QYNKLLE KS DLHS VLAQKLQAEKHDVPNRHE I SPGHDGTWNDNQ 
LQEMAQLRI KHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM 
QMNEAKIAECLQT I SDLETECLDLRTKLCDLERANQTLKDEYDA 
LQIT FTALEGKLRKTTEENQELVTR WMAE KAQEANRLNARE * KR 
LQ E AAS PAAERACRS S KGTSTS RTG 


7040 


34 


789 


KI TPPRR PHRCSSGHGS DNSS VLS GELP PAMG KTAL F YUSGGSS 
GYESVWRDSEATGSASSAQDSTSENSSSVGGRCRSLKTPKKRSN 
PGSQRRRLIPALSLDTSSPVRKPPNSTGVRWVDGPl*RSSPRGLG 
EPFBIKVYBIDDVERLQRRRGGASKEAMCFNAKLK1LEHRQQRI 
AEVRAKYEWI*M KELEATKQY LMLDPNKWLS EFDLEQVWELDS LE 
YLEALECVTERLESRVNPCKAHIiMMITCFDlT 


7041 


1 


567 


SGRVAMGRRRAPAGGSLGRALMRHQTQRSRSHRHTDSWLHTSEL 
NDGYD WGRLNLQS VT3QSS LDD FLATAEIAGTE FVAEKLN I K F V 
PAEARTGLIi S FEES QR I KKIiHEENKQ FLC I PRRPN WNQNTTP EE 
LKQAEKDWFIjEWRRQLXvrLEEEQKLILTPFERNLDFWRQLWRV 
IERSDIWQIVDA 


7042 


7 


345 


PIHMAAAAbRADI \ ISPLFPHIQGYLLLSASHG\ATSLHTKGAL 
PLETVTMYTVIPKSKYVLVKPDTQYPYSENLDEFKRLAENSASN 
DDLLMAE VAI SD YGDKI/TLELRE KY 


7043 


2 


2170 


ARGMAARDSDSEBDLVSYGTGLEPLEEGERPKKPlPLQDOTVRD " 
EKGRYKRFHGAFSGGFSAGYFNTVGSKEGWTPSTFVSSRQNRAD 
KSVLGPEDFMDEBDLSEFGIAPKAIVTTDDFASKTKDRIREKAR 
OLAAATAP I PGATLLDDX.ITPAKLS VGFBLLRKMGWKBGQGVGP 
RVKRRPRRQKPDPGVKI YGCALP PGSS EG S EGEDDD YLPDNVT F 
APKDVTPVDFTPKDNVHGLAYKGLDPHQALFGTSGEHFNLF3GG 
S ERAGDLGEIGLNKGRKLGISGQAFG VGALEE3DDD 1 YATETLS 
KYDTVLKDEEPQDGLYGWTAPRQYXNQKESEKDIiRYVGKILDGF 
SLASKPLSSKKIYPPPELPRDYRPVHYFRPMVAATSENSHLLQV 
LSESAGKATPDPGTHSKHQLNASKRAELLGETPIQGSATSVLEF 
LSOKDKBRIKSMKQATDLKAAQLKARSLAQNAQSSRAQPSPAAA 
AGHCS WNMALGGGTATLKASNFKP FAKDPE KQ KRYDE FLVHMKQ 
GQKDALERCLDPSMTE WERGRERDE FARAALL YASSHSTLS SR F 
THAXE EDDS OQVEVPRDQENDVGD KQ SA VKM KMFGKLTR DTFE W 
HPDKLLFQ/RLVGLPRVKRDKYSVFNFLTLPETASLPTTQASSE 
KVSQHRGPDKSRKPSRWDTSKHEKKEDSISEFLRLARSKAEPPK 
QQSS PLVNKEEEHAPEI^AN 


7044 


276 


734 


EVYLTDEFAXGRKVADliYELVQYAGNIIPRLYLLITVGWYWs - 
FPQSRKDILKDLVEMCRGVQHPLRGLFLRNYLLQCTRNILPDEG 
EPTDEETTGDISDSMDFVLLNFAEMNKLWVRMQHQGHSRDREKR 
ERERQELRILVGrNLVRLSQV 




3 


513 


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTFCP 
KANQWEKTD I EGTLFVYRRSASPYHGFTIVNRLNMHNLVEPVNK 
DLEFQLHEPFLIjYRNASLSIYSIWFYDKNDCHRIAKLMADWEE 
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG 


7046 


3 


513 


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTPCP 
KANQWEKTDIEGTLFVYRRSASPYHGFTrVNRLNMHNLVEPVNK 
DLEFQLHEPFLLYRNASLSIYSIWFYDKNDCHRIAKLMADVVEE 
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG 


7047 


103 


466 


QMK1EKCGWS EGLTS I KGNCHNFYTAISKDVTYKELKNLLNSKN 
IMLIDVREIWSILEYQKIPESINVPLDEVGEALQMNPRDFKEKY 
NEVKPS KSDS / 1 VFS YLAGVRSKKRLDTAISLGFHS YYER 


7048 


92 


627 


FFCLTL LS S WD YRHHATRR V f Q qpVPTMPn cm vn* c»e e-pp p r> * »w — 
WKDLAI^TYKQRAEOTQEELREFQEGSREYEAELETQLQQIETRN 

rdlls ennrlrmeleti kb kfe vqhs egyrqi saleddlaqf ka 
ikdqlqkyireleqanddi.erakratdhglsktfe\qrln\qai 

EKKW 


7049 
7050 


393 
393 


938 
938 


KRTGSAS YGG^f fGJL/SGPATXASVAGRCSSVGKI PARRCYEDEL 
VPVFEAVGRIYELRLM^FDGKNRGYAFVMYCHKHEAKRAVRBL 
NNYBIRPGRIJ^VCCSVDNCRLFIGGIPKMKKREEILBEIAKVT 
EGVLDVI VYAS AADKMKNRGLRLRG VREP PRG CHWLGRKLI AWX 
ASSLWG 

KRTGSAS YGGP PPGLGG PATXAS VAGRCSS VGKIPARRCYEDEL 
VPVFEAVGRlYELRmMDFDGKNRGYAFVMYCHKHEAKRAVRBI, 
NNYEIRPGRLLGVCCSVDNCRLFIGG1PKMKKREEILBEIAKVT 
E G VLDV I VYAS AAD KMKNRG L R LRG VRE P PRG CHWLGRKL I AWX 
1 ASSLWG 


7051 


119 


816 


KraWLAEICDNAKKGREYALLGNYDSSMVYYQGVMbQtQRHCQS 
VRDPAIKGKWQQVRQELLBBYEQVKSrVGTLESFKIDKPPDFPV 
SCQDEPFRDPAVWPPPVPAEHRAPPQ1RR/RQSUSKTSEERNGR 
SRS PGTCRPST\PISKSEKPSTSRDKDYRARGRDDKGRKNMQDG 
AS DGEMPKFDGAG YDKDLVEALERD IVS RN P S IHWDD I ADLE EA 
KXLLREAGVLPMWM 


7052 


467 


715 


SCPGRGKMSWbLNPBEMTSRDYYFDSYAHFGIHEEMLKDEWnT" 
I^RNSWHNKHVFKDKVVLDVGSGTGIIJSMFAAROGPRR 


7053 


467 


715 


SCPGRGKMSIUjLNPEEMTSRDYYFDSYaHFGIHEBMLKDEVRTL 
T YRNSMYHN KHVFKD KWLDVGSGTGI Ii5 M FAARQGPRR 


7054 


1 


1036 


GTSQRSRElHARRRSAGAEPTARLPWPAALEEWPfeCPCEPLGPG 
RRCRWDJ^EYDEKIJU^RQAHLNPFNKQSGPRQHEQGPGEEVPD 

vtpeealpelppgepefrcpervmdlgi^sedhfsrpvglflasd 

VQQLRC^IEFICKQVILELPEQSEKQiajAVVRLIHLRLKLQELKD 
| PNEDEPNIRVLLEHRFYKEKSKSVKQTCDKCNTIlWGLrQTWYT 
| GTGCYYRCHSKCLNLISKPCVSSKVSHQAEYELNICPETGLDSQ 
DYRCAECRAPl/CS/DGWPSEAROCDYTGOYyCSHCHl9I7DLAV 
IPARWHNWDFEPRKVSRCSMRYLALMVSRPVLRLREIN 


7055 
7056 


2 


527 


DSRRVSWRSWUlNfE/WdKHLCLFIWLSMNVLLFWKTFLLYNQGp-- 
EYHYLHQMLG/ALCLSRASASVLNLNCSLILLPMCRTLIAYLRG 
S QKVPSRRTRRLLDKSRTFH ITCG ATI CI FSGVHVAAHLVNALN 
FS VNYS EDF VE LNAAR YRD E D PR KLL FTTVP GLTGVCMEWL FL 
M 




2 


527 


DSRR VS WRS WLAWE / WGKHLCLF I ^JjSMNVLIJ? WKTFLL YNQG P 

EYHYLHQMLG/ALCLSRASASVI.NLWCSLILLPMCRTLLAYLRG 

SQKVPSRRTRRLLDKSRTFHITCGATICIFSGVHVAAHLVNALN 

FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEVVIjFL 
M 


7057 
7058 


1368 


431 


giylhvnekiprptcigdrqendkenlnlenhrdqellhascqa 
sgevpsqaslrgfftedepgcfgegenlpealqniqdegtgeql 
spqerisekqlgqhlpnphsgemstmwleekretsqkgqprapm 
aqklptcrecgktfyrnsqlifhqrthtgetyfqctickkaflr 

SSDFVKHQRTHTGEKPCKCDYCGKGFSDFSGLRHHBKrHTGEKP 
YKCPICEKSFIQRSNFNRHQRVHTGEKPYKCSHCGKSFSWSSSL 
^Q RSHL GKKPFQ*PVTKLSFPISISQPSHKNTQLHQEEI,CLR 




1 


469 


FSGFGAVPDAIiGCRMSDLRITEAFLYMDyLCFRALCCKGPPPAR 
PE YDLVC I GLTGSGKTSLLS KLC5 ES PDNWSTTGFS IKAVPFQ 

ITAILNVKELGGADN1RKYWSRYYQGSQGVIFVLDSASSBDDLEA 
ARN*"SCTQLLQHPQI»CTLPFLI LA 


7059 
7060 


1 


1178 

1 


WPA FPRQ PAAAAMDALLG TGPRRARGCLGAAG PT^wpanDTDS — 
APWARFSAWLECVCWTFDLELGQALELVYPNDFRLTDKEKSSI 
CYLSFPDSHSGCI^DTQFSFRmQCGGQRSPWHADDRHYNSRAP 
VAIjQREPAHYFG YVYTRQVKDSS VKRG YFQKS LVLVS RLPFVRL 
FO^LLSLIAPEYFDKLAPCLBAVCSEIDQWPAPAPGQTLNLPVM 
GVWQVR I PSRVDKS E S S PP KQFDQBNLLPAP VVLAS VHELDLF 
RCFRP VLTHMQTL WE LMLLG E PLLVLAPS PD VSSEMVLALT3 CL 
QPLRFCCDFRPYFTIHDSEFKEFTTRTQAPPNWLGVTNPFFIK 
rLQHWPHILRVGEPKMSGDLPKQVKLKKPFKV*RPWDTKP 




90 


1670 


5 VNLP PSIi WP WEEAMDSTKS E PLKGS PEAEDGNI E YKKLVNP SQ 
SfRFEHLVTQMKWRLQEGRGEAVYQIGVEDNGLLVGLAEEEMRAS 






WO 01/53312 



PCT/US00/34263 



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
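
The legend above can be read as a small lookup table. The following is a minimal sketch, not part of the original disclosure, of how a segment from the last column might be tallied against that legend. The function name summarize_segment and the example segment are hypothetical, and the sketch assumes that whitespace and letter case inside a segment carry no meaning.

```python
# One-letter codes exactly as listed in the table legend.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Special symbols from the legend.
SPECIAL = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment: str) -> dict:
    """Count residues and special symbols in one amino acid segment.

    Assumption (not stated in the source): whitespace and letter case
    are not significant, so both are normalized before counting.
    """
    cleaned = "".join(segment.split()).upper()
    residues = 0
    special = {name: 0 for name in SPECIAL.values()}
    unrecognized = []
    for ch in cleaned:
        if ch in AMINO_ACIDS:
            residues += 1
        elif ch in SPECIAL:
            special[SPECIAL[ch]] += 1
        else:
            # Anything outside the legend, for example a damaged character.
            unrecognized.append(ch)
    return {"residues": residues, "special": special, "unrecognized": unrecognized}


if __name__ == "__main__":
    # Hypothetical segment, chosen only to exercise every symbol class.
    print(summarize_segment("MKT*LL/AV\\G"))
```

Characters that fall outside the legend are collected rather than silently dropped, so that damaged entries can be flagged for manual review instead of being guessed at.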








LKTLHRMAEKVGADI TVLRERKVDYDSDMPRKJTE VLVRKVPDN' 
QQFLDLRVAVLGNVDSGKSTLLGVLTQGBLDNGRGRARLNLFRH 
LH E I QS GRTS S I SFE I LGFNS KGEVHG INGTQWGQTLRMGW * * + 
RT * D3GRVWRLFE I V* MNALRGL *TSS APLRKSMGNQLN* I KNG 
VKIKROGHPGNGU5PGNSEGVGRAGRRH*GPMAT/^)VVKry , <in<?T? 
TAEEICESSS KMITFI DLAGHHKYLHTTI FGLTSYCPDCALLLV 
S ANTG I AGTTREHLGLALALKV PFFI WS KI DLCAKTTVERTVR 
QLERVLKQPGCHKVPMLVTSBDDAVTAAQQFAQSPNVTPIFTLS 
SVSGESLDLLKVFIiNILPPLTNSKEQEELMQQLTEFQVDEIYTV 
P EVGTWGGTLSR* I DLLATLPTQPS P I YSKTSWPKGGDPGI 


7061 


364 


710 


iSMDRDT/lDDPriOUMnDPTTr PVOPTaOT.B OOi~ , 'Cf'"»\//*M?*r»\ r»nni? 
mwur 3 *r iwr rLJjr vnUrCiJ, i ijllll.fc'&lMivlJXvr Kvf w X V /vjrKfc* 

AIARLRELCCQWLQPEAHSKEQMLEMLVliEQFLGTIiPPEIQAWV 
RGQRPGSPEEAAALVEGLQHDP*ARMPSPLGPPCLPVMDPETTL 
EEPETARLRFRGFCYQEVAGPREALARLRELCCQWLQPEAHSKE 
QMLEMLVLEQ FLGTI»P P E I QAWVRGQRPGS PEEAAAL VEGLQHD 


7062 


71 


744 


AKAGTNLERLHWLS YFFCI PKHKLKS SQKDKVRQFKACTQAGER 
TAIYCLTQNEWRLDEATDSFFQNPDSLHRESMRNAVDKKKLERL 
YGR YKDPQDENKIGVDGIQQFCDDLSLDPAS isvlviawkfraa 
TQCEPSRKBFLDGMTE LGCDSMEKLKALLPRLEQELKDTAKFKD 

FYOFTFTFAXNPGDKGIjDIj *MA(SAYWJCT.VT. QfiPlJWT. VT .l»TNrri?T 

MEHH 


7063 


2 


562 


LRTVPDLPGRRFRAMRTGQRR * PELPPDMNSLEQAEDiKAFERR 
LTEYIHCLQPATGRWRMLLIWSVCTATGAWNWLIDPETQICVSF 
FTSLWiraPFFTISCITLIGLFFAGIHKRVVAPSITAARCRTVLA 

LiiiriOUJUiuiUJi JblVJcr Kirn vy UOO l_i.Lv L ivjijls. J.i\JL' JjJK XolJ 1 AK.S 

HKGFLLRLDM 


7064 


300 


884 


RDTGS DPS STRRLCST CCTGH * PAE P I AS PHPS RGTCP PAS SAS 
SRRTGCWTCPPESGHAQARRSRRASASRWGARGAVRSAVAARGC 
SSRAGRWLETPGRRRGPPACAAAAGRLRGPAP*AAPPTASVPAR 
CRCPAARTGAPAAATWLRRRLSGLRAPALGRRRSPGPSPKSAAP 
PLLTPLGAGRAGGSRANS 


7065


1 


555 


ATTTHSAHRSGRGAAAEAAASAAGGRQKGPDRKAWEGRRTTPGG 
RSQSEPKAPPPQKRSEAAFASMAHSPVAVQVPGKQNNIADPEEL 
FTKIiERIGKGS FGE VFKG IDNRTQQWAI KI I DLEE AEDE IEDI 
QQE ITVLSQCDSS YVTKYYGSY LKGSKLWI 1MB YLGGGSALDLL 
RAGPFDEFQ 


7066 


356 


676 


PGPQRGPWRAREGGHPLDPADHPRAPASLRSNVRAATMMQICDT 
YNQKHS LFNAMNRF IGAVNNMDQTVMV PSLLRDVPIADPGLDND 
VGVEVGGSGGCLEERTPP 


7067 



152 



973 


KENITMATEIGSPPRFFHMPRFQHQAPRQLFYKRPDFAQQQAMQ 
QLTFDGKRMRKAVNRKT IDYNPS VI KYLENRI WQ RDQRDMRAI Q 
PDAG Y YNDli V P P I GMLNN PMNA VTTKFVRTS TNX VTCCP VFWRW 
TPEGRRLVTGAS SGBFTLWNGLTFNFET ILQAHDS P VRAMTWSH 
NDMVMLTADHGGYVKYKQSNMNNVKMFQAHKEAIREARFIHNIP 
FS VVP I VMVKL FSKCI LGAEMHGLCQFLGNFLHPINTI FFFVFT 
HSPFCWAPF 


7068 


222 


816 


DTMKBYVLLLFIJ^LCSAKPFFSPSHIALKNMMLKDMEDTJDDDDD 
DDDDDDDDDDEDNS LFPTRE PRSHFF PFDliPP MCP FGCQCYSRV 
VHCSDLGLTS VPTNI PFDTRMLDLQNNKIKB IKENDFKGLTSLY 
GLILNNNKLTKIHPKAFLTTKKLRRLYLSHNQLSE I PLNLPKSL 
AELRIHENKVKKIQKDTFKKK 


7069 


l±4l 


1765 


FRDHRRYFYVNEQSGESQWEFPDGBEEEEESQAQBNRDETLAKQ 
TLKDKTGTDSNSTESSETSTGSLCKES FSGQVSSSSLMPLT P FW 
TLLQSNVPVLQPPLPLEMPPPPPPPPESPPPPPPPPPAPKMPPP 








" EKTKKGRKDKAKKSKTKMPSLVKKWQSIQRKJjDKKDNSSSSKED 
RV5 TAQKR. I E EWKQQQLVSGMABRNANFB A 


7070 


1 


547 


DGTMEDSBAVQRATALIEQRLAQEEENEKLRGDARQKLPMDLLV 
IiBDEKHHGAQS AALQKVKGQER VRKTSLDLRRB I ID VGG IQNL I 
ELRKKRKQKXRDALAASHKPPPEPBBITGPVDBSTPLKAAVEGK 
M KV I EKFLADGG SADTCD Q FRRTALHRASUEGHMEILB KLLD NG 
ATVDPQ 


7071 


2 


921 


ARGTLRALETAKKVGKVGANGQKAAGPSAbSVTENKIGSPPKTP " 

VSNVAATSAG PS NVGTELNS VPQKSS PFLTRVPAYPPHS ENI QY 

FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV 

PESSLPPASMPYADHVSTFSPRDRMNSSPYQPPPPQPYGPVPPV 

PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT 

SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ 

IRRKP DQWAQYHTQ KAPLVS STLP VATQS PT PPSTLNRGEGS 


7072 


2 


921 


ARGTLRAXiETAKKVGKVGANGQKAAG PS ADS VTEN KIGS PPXTP 
VSNVAATS AGPSNVGTELNS VPQKS S PFLTR V PAYP PHS EN I QY 
FQDPRTQIP FEVPQYPQTG YYPPP PTVPAGVAPCVPRFVRSNNV 
PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV 
PSGMYAP VYDSRR I WRP PM YQRDDI IRSNSLPPMDVMHSS VYQT 
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ 
IRRKPDQWAQ YHTQKAPLVS STLPVATQS PTPPS TLNRG EGS 


7073 


50 


504 


LAHG3FGVSDFPAPAAAPAHTLTSFSGSLSPQFRKPLGRAPAMP 
LVRYRKWI LGYRCVGKTS LAHQFVEGEFSEG YD PTVENTYS KI 
VTLGKDE FHLHLVDTAGQDE YS I LP YS F I IGVHG YVLVYSVTS L 
HSFQVI ES LYQKLHEGHGK 


7074 


263 


1003 


VCPVI^STRQEPGHSSLVTYFGKPTRRKSFLLGHCIAAGKMNIS ' ' 
VDLETN YAEttVLDVGRVTLG ENSRKKMKDCKLRKKQN ER VSRAM 
CALLNSGGGVI KAEIENEDYS YTKDGIGLDLENSFSNILLFVPE 
YIJDFMQNGNYFL IFVKSWSLNTSGLR ITTLSSNL YKRDITS AKV 
MNATAALEFBKDMKKTRGRIiYLRPELLAKRPRVDIQEENNMKAL 
AGVFFDRTELDRKEKLTFTESTHVEI 


7075 


598 


1005 


N YINFF FRKE YP PHVQ KVE INPVRLS RLQGVERIMKKTEE S ESQ '" 
VEPEI KRKVQQKRHCB TYQ PTPPLSPAS KKCI/THL EDLQRNCRQ 
AITLNESTGPLLRTS IHQNSGGQKSQNTGLTTKKFYGNNVEKVP 
IDII 


7076


279 


1049 


IjQSBSSNAAEGNEQRHEDSQRSKRGGWSKGRKRKKPIiRDSNAPK 
SPLTGYVRFMNERREQLRAKRPEVPFPEITRMLGNEWSK1PPEE 
KQRYLDEADRDKERYMKELEQYQICTEAYKVFSRKTQDRQKGKSH 
RQDAARQATHDHEKETEVKERSVFDI PI FTEEFtiNHSKAREAEL 
RQLRKSNMEFEERNAALQKHVESMRTAVEKLEVDVIQERSRNTV 
LQQHLBTLRQVLTSSFASMPLPEXGETPTVDTIDSYM 


7077 


3 


1119 


SSMGSNSEINGLAIiRKTDKYGFLGGSQYSGSLKSSIPVDVARQR 
EIiXWLDMFSNWDKWLSRRFQKVKLRCRKGIPSSltRAKAWQYIjSN 
S KELLEQN PRKFEELBRAPGD P KWLDVI EKD LHRQFP FHEMFAA 
«.V7\jnjjuyuiji t^XLt)SJ\i x JLxKFJJiltaxLUAQAPVAAVLLMHMPAEQ 
AFWCLVQICDKYLPGYYSAGLEAIQUDGBIFFALLRRASPLAHR 
HLRRQRIDPVLYMTEWFMCIFARTLPWASVLRVWDMFFCEGVKI 
IFRVALVLLRHTLGSVElOiRSCQGMYETMEQLRKFLPQQCMOBDF 
LVHEVTNLP\rrEALIERENAAQIiKKNRETRGELQYRPSRRLHGS 
RAIH5ERRRQQPPLGPSSS 


7078 


483 


767


FQGQRMAGEQKPSSNI.LEQFILLAKGTSGSALTALISQVLEAPG - 

VYVFGEU^EUU^QEIJ^GAl^YI^IiNLFAYGTYPDYIANKE 

SLPELY 


7079 


2 



376


SWEFKRPKE PSGSDGESDGPI DVGQEGQLSQMARPLSTP3SSQ 
MQARKKRRG I IEKRRRDRINSS L5ELRRLVPTAFEKQGSSKLEK 











ABVLQMTVDHLKMLHATGGTGTHALLFQAS FIQQI F 


7080 


200 


595 


vq lplea pclslls crdhsggnrdlsrrhrdcrv ygs pqdgi p y " 
lthplchodwsvgrlqiralatpghtqghlvylldgepykgps 
clpsgdllplsgcgefprkreelgeegetevraatvpwralkp ' 




213


506 


AVTEEEMII*NSLSLCYHNKLILAPMVRVGTIiPMRLIiAI»DYGADI " 
VYCEEL I E LKM I QCKR WNEVLS T VDFVAPDDRWFRTCEREQN 
RWFQMGTS 


7082 


3 


1137 


AP5RNTMLMAW CR GPVLLCLRQGLGTNS FLHGLGQElPFEGA!R"3E 
CC R S SPRDLRDGEREH B AAQRKAPGAES CPSLPLS I SD IGTGCL 
SSLENLRLPTLREESSPRELEDSSGDQGRCGPTHQGSEDPSMLS 
QAQSATEVEERHVSPSCSTSRERPFQAGELILAETGEGETKFKK 
LFRLNNFGLLNSNWGAVPFGK I VGKFPGQ I LRS S FGKQ YMLRRP 
ALED Y WLMKRGTAITFPKDINM II»S MMD INPGDTVLEAGSGSG 
GMSLFLSKAVGSCK5RVISFEVRKDHHDLAKKNYKHWRDSV7KLSH 
VEEWPDNVDFI H1QDIS GATED I KSLT FDAVALDMLNPH VTL PVF 
YPHLKHGGVCPVYWN I TQVIELLD 


7083 


115 


541 


RSWAVQLTRME YAM KS L S LLY PKS LS RH VS VRTS VVTQQLLSEP
SPKAPRARPCRVSTTulRSVRKGIMAYSLEDLLLKVRDTlJ^LADK 
PFFLVLEEDGTTVETEEYFQALAGDTVFMVLQKGQKWQPPSEQG 
TRHPLSLSHK 


7084 


3 


522 


NS VS VSSQSRFIiASVPGTGVQRSAAADMAASTAAGKQR I PKVAK
VKNKAPAEVQITAEQLLREAKKRKLELLPPPPQQKITDEEELND 
YKTjRKRKTFEDWI RKNRTVI SKIW I KYUOWFT? <3T»K"R t nu a o c tvt? 

RAI^VDYRNITLWLKYAEMHMKNRQVNHARNIWDRAITTL 


7085 


243 


1499 


RQLARLRRRG WRS PFGGAPMAHIT 1 NQYLQQVYEAI DSRDGASC 
AELVSFKHPHVANPRliOMASPEEKCOOVTjRPPYnPMTyAam oot* 
YAVGNHDFIEAYKCQTVIVQSFLRAFQAHKEENWALFVMYAVAL 
DLRVFANNADQQLVKKGKSKVGDMLEKAAELLMS CFRVCASDTR 
AGIEDSKKWGMLPLVNQLFKIYFKINKLHLCKPLIRAIDSSNLK 
DD YSTAQR VTYKY YVGR KAMFDSDFKQAEE YLSFAFEHCHRSSQ 
KNKRMILIYLLPVKMLLGHMPTVELLKKYHLMQPAEVTRAVSEG 
NLLLLHEAIAKHEAFFIRCGI FLILEKLKI IT YRNLFKKVYLLL 
KTHQLS LDAF LVALKFMQVEDVD IDEVQCILANL I YMGHVKGY I 
SHQHQKIiWSKQNPFPPLSTGC 


7086 


256 


525 


ILAARMGKQNS KLRPEVMQDLLESTDFTEHEIQEW YKGFliRDC? 
SGHLSMEEFKKI YGNFFPYGDAS KFAEHVFRT FDANGDGT I DFR 
BF 


7087 


166 


723 


LSGS SAGKVAAP CVPP SNHELVPITTENAPKNWDKGEGASRGG ' "' 
NTRKS LEDNGSTRVTPS VQPHLQP1 RNMSVS RTMEDS CELDIiVY 
VTER1 IAVS FPSTANBENFRSNLRE VAQMLKSKHGGNYLIjFNLS 
ERRPD I TKLKAKVLEFG WPDLHTPALEKICS I CKAMDTWLNAHP 
HRCRVLHNKG 


7088 


104 


759 


GTS AAS PSS LLEMAGE ITETGELYSS Y VGLVYMFNL 1 VGTGALT " 
MPKAFATAGWLVSLVLLVFLGF^FMTXTFVIEAMAAANAQLHW 
KRMBNLKEEEDDDSSTASDSDVLIRDNYERAEKRP ILSVQRRGS 
PNPPEITDRVEMGQMASMFFNKVGVNLFYFCI I VYL YGDLA I YA 
AAVPFSLMQVTCSATGNDSCGVEADTKYNDTDRCWGPLRRVD 


7089 


33 


1775 


S VC WEDRYLKARME ES PLSRAPSRGG VNFLNVART Y I PNTKVEC 
HYTLPPGTMPSASDWIGI FKVEAACVRDYHTFVWS SVPESTTDG 
S P IHTS VQ FQAS YLPKPGAQLYQFRYVNRQGQVCG QSPPFQFRE 
PRPMDELVTLEEADGGSD1LLWPKATVLQNQLDESQQBRNDLM 
QLKLQLEGQVTBLRSRVQEIiERALATARQEHTELMEQYKGISRS 
HGE ITEERDILSRQQGDHVARI LELEDDIQTI S E KVLTKE VE LD 
RliROTVKALTREQEKLtiGQLKEVQADKE QSEABLQVAQQENHHL 
NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAELEP 








LKEQFiRGAQELAASSQQKATLLGEELASAAAARDRTIAELHRSR 

i^a£vngkla£lglhlkeekcqwskeragllqsveaekdkilk 
ls ae i lrlekavqee rtqnqv pktelarexd s s l vql s eskrefl 
telrsalrvlqkekeqlqeekqelleymrklearlekvadekwn 
edattede baavgls cpaaltdsedespedmrlhpmapvsvbtq 

ASLLLGLE 


7090 


33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTyiPNTKVEC 
HYTLP PGTMPSASDWIGI FKVEAACVRDYHT FVWSS VPESTTDG 

or* j.ni ovyruAa I UlrRrviAUuXSif KX VNKUVjU vUUUoVPtf\{IrKJS 

PRPMDELVTLBEADGGSDILLWPKATVLQNQLDESQQERNDLM 
QLKLQLEGQVTELRSRVQELERALATARQEHTELMEQYXGISRS 
HGEITEERDILSRQQGDHVARILELEDDIQTISEKVLTKEVBLD 
RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL 
Nl^LKEAKSWQEEQSAOAORLKDKVAQMKDTLGQAQQRVAEriEP 
LKBQLRGAQELAASSQQKATLLGEELASAAAARDRTIAELHRSR 

LSAEILRLEKAVQEERTQNQVFKTKTJVREKDSSLVQI^ESKREL 
TELRSALRVLQKEKEQLQEEXQELLSYMRKLEARLEKVADBKWN 

ASLLLGLE 


7091 


186 


1076 


EEPADSGQSLVPVYIYSPEYVSMCDSLAKIPKRASMVHSLIEAY 
ALHKQMRI VKP KVASMEEMATFHTDAYLQHLQKVSQ EGDDDHPD 
SIEYGLGYDCPATEGIFDYAAAIGGATITAAQCLIDGMCKVAIN 
WSGG WHHAKKDBASGFCYLNDAVLC TL.Rr.RU lfPTRTT.Wnr nT a 
HGDGVEDAPS FTSKVMTVSLHKFS PGPFPGTGDVSDVGLGKGRY 
YS VNVPIQDG IQDEKYYQ I CER YEPPAPNPG L 


7092 


522 


809 


KQG INEDQEESQKPRLGEGCEPIS KRQMKKL IKQKQWBEQRELR 
KOKRKEKRKRKKLEROCOMEPNSDGHDR KRVR RnWw^TT.w t.tt 
DCSFDXLM 


7093 


454 


655 


nfgvsgvelaqqasmwmsfViaacqlvlgllmtsLtessiqns 
ecpqlcvce irpwftpqst yrea 


7094 


2 


508 


FVRSMHWGVGFAS SRPCWDLSWNQSISFFGWWAGSEEPF3 FYG 
DI IAFPLQDYGGIMAGIiGSDPWWKKTLYLTGGALLAAAAYLLHE 
LLVIRKQQEIDS KDAI I LHQFARPNNGVPSLS P FCLKMBTYLRM 
AOL P YQN YFGG KLSAQGKM PWI E YNHEKVSGTE F 1 1 


7095


1 


411 


IASSLPKMASLLQSDRVLYLVQGEKKVRAPLSQLYFCRYCSELR 
SLECVS HE VDSH YCPSCLENM PSAEAKLKKNRCANCFDG PGCMH 
TLSTRATS ISTQ L PDDPAKTTM KKAYYLACX3 F CRWTS RDVGMAD 
KSVGE 


7096 


224 


2067 


ETRSLAVQEKP S QAGRRRS5RI S FAGALFLTR FLLQE LLLNN FC 
SAMS PAPDAAPAPAS I S LFDLSADAP VFQGLSLVSHAPGE ALAR 
APRTS CSGSGERESPERKLLOGPMDI SEKLiFCSTCDOTPOWHriP 
QREHYKLDWHRPNLKQRLKDKPLLSALDFEKQSSTGDLSSISGS 
EDSDSASEEDLQTLDRERATFEKLSRPPGFYPHRVLFQNAQGQF 
LYAYRCVI/3PHQDPP BEAELLLQNLQS KGPRD CWLMAAAGHFA 
GAI FQG R EWTHKTFKR YT VRAKRGTAQGLRD ARG GPSHSAG AN 
LRRYNEATLYKDVRDLLAGPSWAKALEEAGT I LLRAPRSGRSLF 
FGGKGAPLQRGDPRLWDI P LATRRPTFQELQ R VLHKLTTLKVYE 
EDPRE AVRLHS PQTHWKTVREERKKP TEEE I RKI CRDEKEALGQ 
NEESPKQGSGSEGEDGFQVELELVBLTVGTLDLCESEVLPKRRR 
RKRNKKEKSRDQEAGAHRTLLQQTQ E E EPS TQSSQAVAAPLGP L 
LDEAKAPGQPELWNALLAACRAGDVGVIJCLQLAPS PAD PRVLSL 
LSAPLGSGGFTLLHAAAAAGRGSVVRLLLEAGAJDPTVQCQDH 


7097 


256 


1228 


I RTKSAATWEAWP QCGREGSRI ITE P CEANAGSRQELQTBRI SS 
FIAAQGDQAFHSGLETNNSNS ELPLRVGLKVAQGS PLMGGQ VSA 








SNSFSRLHCRNANEDWMSALCPRLmiVPIJiHLSIPGSHPTMfYCn 
IiNKKS PISHEESRLLQLIiNKALPCI TRPWLKWS VTQALDVTEQ 
IJW3VRyijDLRIAHML£GSEKNI^FVHMVYTTALVEDTl.THISB 
WLERHPREWILACRNFEGLSEDLHEYLVACI3CNIFGDMLCPRG 
EVP TLRQLWSRGQQV I VS YEDE SSLRRHHELWPGVP YWWGNRVK 
TEALIRYLRTMKSCGR i 


7098 


82 


956 


ssflkrcrkvlgcwgipseqslfstleeprdisidnycvmrCqt I 

EARSGFWAPNRFP VN I CRMTAVDGDRGGSSRETCRCHFH P S LEA 
LVLLLQDWQPGGVGI CTS FLGISWALLDYHRALRTCLPS KPLLG 
LGS S VI YFLWNLLLLW PRVLAVAL FSAL FPSYVALHFLGLWLVL 
I>LW VWLQGTD FMPDPS S E WL YRVTVATI L YFS W FNVAEGRTRGR 
A3 IHFAFLLSDSILLVATWVTHSSWLPSG1PLQLWLPVGCGCFF 
LGLALRLVYYHWLHPSCCWKPDPDQVD


7099 


992 


210 


lfrlapgflrslarqgykqiwafpflpsgatatwpaasrsrsla
arslprsparpgpndal lgehdfrgqgvraqrfrfseb pgpgad 
gavlevhvpqigagvslpgilaakcgaevilsdsselphclevc 
rqscqmnnlphlqvvgltwghiswdllalppqdiilasdvpfep 
ed fed i lat i y plmhkn p kvqlwsty qvrsadwsleall ykwdm 

KCVHI P LES FDADKED I AESTL PGRHTVEMLVISFAKDSL


7100 


205 


671 


ANGGFWEAAPGSE^SLPUm»TASHSKTTAI^IGSAPPPHLSVlH 
FLFSFPPQIiGDPLEAFPVFKKYDRNGLNVSlECKRVSGLEPATV 
DWAFDLTKTNMQTMYEQS EWGWKDRE KREEMTDDRAW YLI AW EN 
S S VP VAFSHFRFDVERGDE VLYW


7101 


2 


503 


WRGGPRRAKRIJU3GAVGWVI^VRGVHS\/RAGGGRP 
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEEITIPADVTP 
ERVPTHIVDYSEAEQSDEQLHQEISQANVICIVYAVNNKHSIDK 
VTSRWIPL1NERTDKDSRLPLILGGNKSDLVBYSR \ 


7102 


2 


503 


WRGGPRRAKRLAGGAVGWVLLVRGVHSVRAGGGRPPRAADMKKD 1 
VRILLVGEPRVGKTSLIMS LVSEEFPEEVPPRAEEITI PADVTP 
ERVPTHIVDYSEAEQSDEQLHQEISQANVICIVYAVNNKHSIDK 
VTS RW I PL INERTDKDS R L P L ILGGN KSDLVE YSR j 


7103 


119 


438 


GSQSSVAVNIRSGTDEESMDLMNGQASSVNIAATASEKSSSSES I 
LSDKGS ELKKS FDAWFD VLKVT PEE YAGQ ITLMD VP VFKAI Q P 
DELSS CGWNKKEXYSSAP


7104 


1670 


795 


RLWEHRSVSAGASGWGLSSPGCLLIjHPSLPEEERVDILirmAGvH 
MRCPHWTTEDGFEMQFGVNHLGEAWAGAAPWVQAIT.PRRPPKVL I 
GF*V* VKSDLFI ILNPGHFLLTNLLLDKLKASAPSR I INLSSLA 
HVAGH IDFDDLNWQTRKYNTKAAYOQS \ KLAIVLFTXELSRRLQ 
GSGVT VNALHPGVARTEIjGRH TG IHGS TFLQHHN\ WAHLLAAWS 
KS PRS W PAP AQHNTLAVAEELA\ VISG KYFDGLKQKA PAP EAED 
3EVARRLWABSARLVGLEAPSVREQPLPR


7105 


765 


143 


GQMCRRPSPKSTSCI^I^CDLP/RGIiQDPQCIiAJLFRVATOIWOaH 
LLKAAMSGQGVDRHLFAL Y I VSRFLH LQS P FLTQ VHSEQWQLST 
SQIPVQQMMLFDVHNYPDYVSSGGGFGPADDHGYGVSYIFMGDG 
kix a r« j.o:>K.*\£>i> i R.iiJbMKbuQHIEUALLDVASLFQAGQHFXRR 1 
FRGSGKENSRHRCGFLSRQTGASKASMTSTDF j 


7106
7107


14 
1145 


1064 
591 


glqaghphprsasripeadth\ysklqrafdsivnkdhkrmfgt J 

YFRVGFFGSKFGDLDEQEFVYKEPAITKLPEISHRLEAFYGQCF 

gaefvevikdstpvdktkldpnkayiqitfvepyfdeyemkdrv 
ty feknfnlrr fm yttp ftlegr prgelheq yrrwtvlttmhaf 

PYnCTRISVIQKEEFVLTPIEVAIEDMKKKTLOLAVAINQEPPD 
AKMLQMVLQGS VGATVNCX3PLE VAQVFLAE I PADP KLYRHHNKL 
RLCFICEF IMRCGEAVEKNKRLI TADQR3 YQQELKKNYNKLKBNI* 
RPMIERKIPELYKPIFRVESQKRDSFHRSSFRKCETQLSQGS
♦I*WLQTGKKK 


7108 


1 


542 


VK VALLLTNLE Q P RTES E WENS PTLKM FL FQ FVNLNS S T FYI AF 
FLGRFTGHPGAYLRLlNRWRIiEECHPSGCLIDLCMQMGI IMVLK 
QTWNNFMEICyPLIQNWWTRRKVRQEHGPBRKISFPQWEKDYNL 
QPMNAYGLFDE YLEMlLQFGFTTIFVAAFPtAPLLALLNNI I E I 
RLDAYKFVTQWRRPLASRAKDIGIWYGILEGIGILSVITNAFVI 
AITSDFIPRLVYAYKYGPCAGQGEAGQKCMVGYVNASLSVFRIS 

DFENRSEPESDGSEFSGTPLKYCRYRDYRDPPHSLVPYGYTLQP 
WHVLAW 


7109 


964 


102 


wdqrkrnslvpgpahgpaqeepwekkeslgaaqealsiqlqpke

TQPFPKSEQVYLHFIiSWTEDGPEPKDKGSLPQPPITEVESQVF 

seklatdtstfeatsbgtlelqqrnpkaerlrwspaqeesfrqm 

WIHKEIPTGKKDHECSECGKTFiyNSHLWHQRVHSGEKPYKC 

sdcgktfkqssnlgqhqrihtgekpfecnbcgkafrwgahlvqh 
qrihsgekpyecnecgkafsqssylsqiirrihsgekpfickecg 
kaygwcsbli rhrrvharke psh 


7110 


96 


697 


RLDN PSGFLVEVTKEERH I V KPLYDRYRLVKQM LTRAS I T PVtiG 
SPSTiCRRGQMLQPIIEGETAHFFEBIKEEEEDGVNLSSELGDML 
KTAVQ VQSS LKMTS ESDVE ENQEKLALDLRLSS S RAASMPE LLEQ 
LWKARAEKKKLRiCTLREFEEAFYQQNGRNAQKEDRVPVLEEYRE 
YKKIKAKLRLLEVLISKQDSSKSI 


7111 


2 


414 


GSGLYRGPTPGGQCIWKPNSMPPDHERNFGFTQFALELNELTAE
LKRSLPSTDTRLRPDQRYLEEGNIQAAEAQKRRIEOLORDRRKV 
WEENNIVHQARFFRRQTDSSGKEWWVTNNTYWRLRAEPGYGNMD 
GAVLW 


7112 


103 


495 


PRCFPVADRGRLIGGLPDWTIMEGKTI^LTCTVF'GNPDPEVIW 
FKNDQD I QLSEHFS VKVEQAKYVSMTI KGVTSEDSGKYS INIKN 
KYGGEKI DVTVS VY KHG E K I PDMAPPQQAKPKL I PAS AS AAGQ 


7113 


1 


824 


KCLRQAWHEAPSSLAFTRWCSRBERAEGGGNLHRSITRDPICPPG"" 

IiRPSQRPMDDKKKKRSPKPCLAQPAQAPGTLRRVPVPTSHSGSL 

ALGLPHLPSPKQRAKFKRVGKEKGRPVLAGGGSGSAGTPJjQHSF 

LTEVTDVYEMEGGLLNLLNDFHSGRLQAFGKECSFEQLEHVREM 

QEKLARLHFSLDVC5GEEEDDEEEEJDGVTEGLPEEQKKTMADRNL 

DQLLS NLGSC LG ALVPG GMRGG EGT YS QSHSKALGE KVGVHG S K 

SSGPLNLPRR 


7114 



3 


1492 


VWEVDEQIDHYKESQDKFLWQAAFIGKETLKDESGQECK1CRKI 
IYLNTDFVSVKQRLPKYYSWBRCSKHHLNFLGQNRSYVRKKDDG 
CKAYW KVCLH YNLHKAQ PAERFFDPNQ RGKALHQKQAJLRKS QRS 
QTGEKL YKCTECGKVFIQKANLWHQRTHTG EKP YECCECAKA F 
SQKSTL IAHQRTHTGEKP YE CSECG KTFIQKSTLI KHQRTHTG E 
KPF VCDKCPKAFKS S YHL I RHE KTH I RQAFYKGI KCTTSS L 1 YQ 
RIHTSEKPQCSEHGKASDEKPSPTKHWRTHTKENIYECSKCGKS 
FRG KSHLS VHQRI HTGEKPYECS ICGKTFSG KS HLSVHHRTHTG 
EKP YECRRCGKAFGEKSTLI VHQRMHTGEKP YKCNE CGKAFSEK 
SPHKHQRIHTGERPYECTDCKKAFSRKSTLIKHQRIH1GEKPY 
KCS ECGKAFSVKS TL I VHHRTHTGEKPYECRDCG KAFSGKSTL I 
XHQK SHTGDaNL 


7115 


1 


947 


KAAHGYNWGLWCM Y 1 I PPQD WLDRGDESAP I RT P AMI GCS FWD " 
REYFGDIGLLDPGMEVYGGENVKtiGMRVWQCGGSMEVLPCSRVA 
HIERTRKPYNNDIDYYAKRNALRAAEVWMDDFKSHVYMAWNIPM 
SNPGVDFGWSERLAIiRQRLKCRSFKWYLENWPEMRVYNNTLT 
YGEVRNSKASAYCLDQGAEDGDRAILYPCHGMSSQLVRYSADGL 
LQLGPLGS TAFLPDSKCLVDJDGTGRMPTLKKCEDVARPTQRJLWD 
FTQSG P I VSRATGRCLEVEMS KDANFGLRLWQRCS GQKWM1 RN 
WIKHARH 


7116 


866 


95 


RVRMRRNAEVIEEKLSMKSWAKFRPGBPWKGYPNIDPETDPYVT 
PGSVINNLSINTVREVDHLRDRKSGSSSSLNTTLPSTSAWSSIR 








ASNYNVPLS STAQSTS ARNSDSKIiTWS PGS VTNTSLAHE LWKVP
LPPKNITAPSRPPPGLTGQKPPLSr^DNSPLRIGGGWGNSDARY 
TPGSSWGBSSSGRITNWLVLKNLTPQIDGSTLRTLCMQHGPLIT 
FHIiNLPHGNALVRYS SKEE WKAQKSLHT SDLFLLTL 


7117 


695 


1261 


LIiISTPGGCHPPPSSIEFTYTGAWGKALPAPHMPCAPGAijPQGA " 
FVSQAARAI PLLQPS QAAQAEGLSQPARACGALCSLPW PLRNWG 
SPI LRLPGGLRTPTNDRKTRTRSAMACWARAQWDTLGPLKLSHR 
GKVCLRHPRPTGVRGGPGAAGRQGGMGTRRRGT FTSGARDPGGL 
RVKHRCQPTGHLP 


7118 


49 


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEVSE^SLLQPQV^ 
ESVliNLGKFHS I VRLVAFCPFASSQVALENANAVSEGWHEDIiR 
LLLETHIiPSKKKKVLLGVGDPKIGAAIQEBLGYNCQTGGVIAEI 
LRGVRLHFHNLVKGLTDLSACKAQLGIiGHSYS RAKVKFNVNRVD 
NMIIQSISLLDQLDKDINTFSMRVREWYGYHPPELVKIINDNAT 
YCRliAQFIGNRRELNBDKLEKLBEIjTMDGAKAKAlLDASRSSMG 
MDISAIDLIKIESFSSRWSLSBYRQSLHTYLRSKMSQVAPSLS 
AL IGEAVGARIiI AHAGS LTNIAKY PASTVQI LGAEKALFRALKT 
RGNTPKYGLI FHSTFIGRAAAKNKGRISRYLANKCS I ASRI DCF 
SEVPTS V FGE KLREQVBSRLS F YETGE I PRKNLDVMKBAMVQAE 
F-AAAE I TRKL E KQEKKRLKKEKKRLAALAtASSENSS STP EE CE 
EMSE KP KKKKKQ KPQ B VPQENGKEDPS I S FSKP KKKKSFSKEEL 
MSSD LE ETAG S TSI P KR KKSTPKEBTVNDPEE AGHRSGSKKKRK 
FSKE E P VSSG PE EAAGKS SS KKKKKFHKASQED 


7119 


49 


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYALLAl.KEVEEISLLQPQVE
ESVLNLGKFHSIVRLVAFCPFASSQVALENANAVSEGWHEDLR 
LIjLETHLPS KKKKVLLG VGDPKIGAAIQEEIX3 YNCQTGGVI AE I 
IJiGVRLHFHNLYKGtiTDLSACKAQLGLGHSYSRAKVKFNVNRVD 
NMIIQS ISLLDQLDKDIWTFSMRVREWYGYHFPELVKI INDNAT 
YCRLAQFIGNRRELNEDKLBKLEELTMDGAKAKAILnASRSSMG 
MDI SAIDLIN IESFS SRWS LSE YRQS LHTYLRS KMSQVAPSLS 
ALIGEAVGARLIAHAGSLTNLAKYPASTVQILGAEKALFRALKT 
RGNTP KYGL I FHSTFIGRAAAKNKGRISRYLANKCS X ASR I DCF 
SEVPTSVFGEKLREQVEERLSFYETGE I PRKNLDVMKEAMVOAE 
E AAAE ITRKLEKQEKKRLKKEKKRLAAIiAXAS S ENSSSTP E ECE 
EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKS PS KEEL 
MSSDLEETAGSTS IPKRKKSTPKEETVNDPEEAGHRSGSKKKRK 
FSKEEP VSSG PE B AAGKSS S KKKKKFHKASQED 


7120 


1991 


64 


QLGTRJICLRGDKVTNAMQDFLVTNLEPRFIEPQTANLSVVFKDS 
NSTTPLI PVLS PGTDPAADL YKFAEEMKFSXKLSA IS LGQGQG P 
RAEAMMRSSIERGKWVFFQNCHtiAPSWMPALERLIEHINPDKVH 
RDFRLWLrSLPSNKFPVSILQNGSKMTIEPPRGVRANLLKSYSS 
LGEDFLNSCHKVMEFKSLLLSLCLFHGNALERRKFGPLGFNI PY 
BFTDGDLRICISQLKMFLDEYDDI PYKVLKYTAGEINYGGRVTD 
DWDRRCIMNILEDFYNPDVLSPEHSYSASGlYHQIPPTYDIiHGY 
LSYIKSLPLNDMPEI FGLHDNANITFAQNETFALLGTI IQLQPK 
SSSAGSOX^BIVEDVTQNILLKVPEPIIMWVMAKYPVLYEES 
MNXVLVQEVIRYNRLLQVlTQTLQDLLKALKGIiVVMSSQLBLMA 
ASLYNNTVPELWSAKAYPSLKPLSSWVMDLLQRLDFLQAWIQDG 
IPAVFWrSGFFFPQAFLTGTLQNFARKFVISIDTISFDFKVMFE 
APSELTQRPQVGCYIHGLFLBGARWDPEAFQLAESQPKELYTEM 
AVIWLLPTPNRKAQDQDFYLC PI YKTLTRAGTLSTTGHSTNYVI 
AVE I PTHQPQRHWI KRGVALI CALDY 


7121 


2 


546 


RPLRPWVLSLGSMVGLMTYGRRQFQSLDTTMRRL I PPFREAS AK 
LTTLVDADAEAFTAY LEAMRLPKNTPEE KDRRTAALQEGLRRAV 
S VPLTIAETVAS LWP ALQ PJjARCGNLACRSDLQVAAIQ\LEMGVF 
GAYFNVLINI^DITDEAFKDQIHHRVSSLLQEAKTQAALVLDCL 
ETRQE


7122 


2 


546 


RPLRPWVLSI^SMVGLMTYGRRQFQSijDTTMRkLIPPFREASAK 
LTTLVDADAEAPTAYLEAMRLPKNTP EE KDRRTAALQEGLRRAV 
S VPLTLAETVASLW PALQELARCGNLACRSDLQ VAAKALEMGVF 
GAY FNVL INLRD I TDEAFKDQ IHHRVSSLLQEAKTQAALVLDCL 
ETRQE 


7123


1 



1092 


K PAVPE ARSAGTS EAGRSGAEE VS CGS VSG DG AAMRLTPRALCS 
AAQAAWRENFPLC^RDVARWPPOHMAKGLKKMQSSLKLVDCl IE 
VHDAR I PL SGRN P LFQETI/3I*KPHLLVLNKMDLADLTBQQKI MQ 
HLEGEGLKNV1PTNCVKDENVKQIIPMVTELIGRSHRYHRKENL 
8 YC IM VIG VPNVG KS SI* INSLRRQHLRKGKATR VGGBPG I TRAV 
M S KI QVSERPLMFLLDTPG VLAPR IESVETGLKIiALCGTVLDHL 
VGBETMADYLLYTLNKHQRFGYVQHYGLGSACDNVERVliKSVAV 

XLGKTQKVKVLTGTGNVNVIQPlfrPAAARDFLQTFRRGLLGSVM 
LDLDVLRGHPRV 


7124 


2 


382 


i.PiTLLIJUlPFAHJLLLPP^HbQSPCWHPGPALSPGTLGPLSWAM 
ANSG LQLLG Y FLALGGWVGI I ASTAI* PQWKQS S YAGDAS I QLR3 
KVFVLESEWGGDSLGLPRDCGWSCLLHSAVRSEKGFWS 


7125 


166 


1127 


NCISEKRNYSFSMQKGKGRTSRIRRR^CGSSESRGVNESHXSE^ 
FI ELR KWLKARKFQDSNLAPACFPGTGRGLMSQT5LQEGQMI IS 

LPESCLLT\ rdtvtrsylgayitkwkp ppspllalctflvsekh 

AGHRS LLEA \ Y LE I L PKA YTC PVCL E PE WNLL P KS LKAKAEEQ 

rahvqeffassrdffsslqplfaeavdsifsysallwawctvnt 
ravyl \spgsgnaflqsrtpvqlapyldllnhs phvqvkaafne 

ETHSYEIRTTSRWRKIIEEVFICYGPHDNQRLFLEYGFVSVHNPH 

acvyvsrgwnglcs 


7126 


1 


733 


CRDMAAFIVPSPARRCSQKGSL3HLPTQPWLWAAMSPRGQERGT 

shsqarepqrpgrwllgslqsspgtlgqagtasrrrgcmvqrwv 

QVATGRRAVQVPKGAI^IiAIjGETSPGASRGMSGGAGGCWALGWA 

pspvlpswllegpppwlsiisdsgtqrpsprrcparpspwgpqc 
wrggriasaeasst*tpgsgsrarsgrrspgsrrrsasapsptp 
ptdaca* scvarpagsrssrpaaa 


7127 


1311 


277 


GLPAMCST* KAGYYEETEGDG 1 PKDR* IEKRPFKEI * RRIPRI F 
AKQKQ 1 *S*NSQKIGASEIDRGRKEADCSDAPAAARIGAVSVFR 
RSTQEARVSPRSNAKSANLRAVRAD*VJEHFVLLFHTPEQFLAEC 
ICRST* *K* WHQLC*PliSSL*TGI»KRKLLL*VLFRI *WLKDCDV 
* FCQKI F ATNFCNWQNLIQ* EE * KP VE YS VEN* H IMNLLLPM * I* 
OQSS LRDQT I VTWRM * RN YS MFRINM I SS I**DGS I H I PLKLHFY 
PALIFTLTVPlNSCCQRPLPLFAHQSIKTIiASSGSPMIACLRFL 
LVKKRAFIHTPRS PGCS V* CXHVLVKDNKNNCVGSEV 


7128 


2 


5228 


GR VDLWTI LLGRS ALRELSQ I E AB LN KHW RRLL BGLS YYKP PS P 
SSAEKVKANKDVASPLKELGLRISKFLGLDEBQSVQLLQCYLQE 
DYRGTRDSVKTVLQDERQSQALILKIADYYYEBRTCILRCVLHL 
LTYFQDERHPYRVEYADCVDKLEKELVSKYRQQPEELYKTEAPT 
WETHGNLMTERQVSRWPVC^LRBQSMLLEI IFLYYAYFEMAPSD 
u^vuitsjnt Kt,QG FGS RQTNRHIi VDETMDP FVDR IGYFSAL I LVE 
GMDIESLHKCALDDRRELHQFAQDGLI CQDMDCLMIiTFGDI PHH 
AP VLLAWALLRHTLNPEETS S VVRKIGGTAI QLNVFQ YLTRLLQ 
SLASGGNDCTTSTACMCVYGLLSFVLTSLELHTLGNQQDIIDTA 
CEVLADPSLPBLFVJGTEPTSGLGIILDSVCGMFPHLLSPLLQLL 
RALVSGKSTAKKVYSFLDKMSFYtTELYKHKPHDVISHEDGTLWR 
RQTPKIJLYPUX5C^NLRIPQGTVCQVMLDDRAYLVRWEYSySSW 
TLFTCE IEMLLHWS TAD VIQHCQRVKP 1 1 DLVHKVT STDLS IA 
DCtiPITSRIYT^LQRLTTVlSPPVDVIASCVNCLTVLAARNPA 
KVWTDLRHTGFLPFVAHPVSSLSQMISAEGMNAGGYGMLLMNSE 
QPQGEYGVTIAFLRLITTLVKGQLGSTQSQGLVPCVMFVUCEML 








PS YH KWRYNSHG VR EQI GCLI LELI HAI LNLCHETDLH S SHTPS 
LQFLCICSLAYTEAGQTVINIMGIGVDTIBMVMAAQPRSDGAEG 
QGQGQLLI KTVKLA FS VTNNVIRLKP PSNWSPLEQAL S QHGAH 
GNNLIAVIaAKYI YHKHDPALPRLAJ QLLKRLATVAPMS VYACLG 
NDAAAIRDAPLTRLQSK\ IE\DMRI K\VMIL\EFLTVA \VETQP 
GLIELFLNLEVKDG\SDGS KEFSLGM W\ SCLHAV/ VWEL I DSQQ 
QDRYWCPPLLHRAAIAFLHALWQDRRDSAMLVLRTKPKFWENLT 
SPLFGTLSPPSETSEPSILETCALlMKIICLEiyYWKGSLDQP 
LKDTLKKFS I EKRFAYWSG YVKS LAVHVAETEG SS CTS LLE YQM 
LVSAWRMLLIIATTHADIMHLTDSWRRQLFLDVLDGTKAliLLV 
PASVNCLRLGSMKCTLLL IXjLRQWKRELGS VDE ILG PLTEILEG 
VLQADQQLMEKTKAKVFSAFITVLQMKEMKVSD I PQYSQLVLNV 
CETLQEEVIALFDQTRHSLALGSATEDKDSMETDDCSRSRHRDQ 
RIX3VCVIX3LH1AKELCEVDEDGDSWLQVTRRLP I LPTLLTTLE V 
SLRMKQNLHFTEATLHLLLTLARTQQGATAVAGAGITQS1CLPL 
LSVYQLSTNGTAQTPSASRKSLDAPSWPGVYRLSMSLMEQLLKT 
LRYNFLPEALDFVGVHQERTLQCLNAVRTVQSLACLEEADHTVG 
F I LQLSN FMKE WH F HL PQLMRDIQVNLG YLCQACTSFLHSRKML 
QHYLQNKNGDGLPS AV\ AQRV\ QRP PS AAS AAPS SSKQPAADTE 
ASECMALHTVQYGLLKILSKTLAALRHFTPDVCQrLLDQSLDIA 
E YNFLFALS FTTPTFDS E VAPS FGTIjLATVNVAliNMLGE LDKKK 
E PLTQAVGLS TQAEGTRTLKS LliMFTMENCFYIiI>lSQ AMRYLRD 
PAVHPRDKQRMKQELSSELSTLLSSLSRYFRRGAPSSPATGVLP 
SPQGKSTSLSKASPESQEPLIQLVQAFVRHMQR 


7129 


1 


1054 


FRRFRWRRRLH * AGPASSAGGS PGEAS GTMS GEL PPN I NI KEPR 
WDQSTFIGRANHFFTVTDPRNILLTNEQLESARKIVHDYRQGIV 
PPGLTENELWRAKYIYDSAFHPDTGBKMILIGRMSAQVPMNMTI 
TGCMOTFYRTTPAVLFWQWINQSFNAVVNYTNRSGDAPLTVNEI, 
GTAY VSATTGAVATALGLNALTKHVS P L IGRF VP FAAVAAANCI 
NI PLMRQRELKVG I P VTDENGNRLGESANAAKQA I TQWVS RI L 
MAAPGMAI PPFIMNTLE KKAFLKRFP WMS AP I QVGLVGFd#VFA 
TPLCCALFPQKSSMSVTSIiEAELQAKIQESHPELRRVYFNKGI* 


7130 


2 


780 


HEVPSIjQTSDPLPGSVQRCSVWSQPNKENWCQDHLYNSIXjRKG 
I SAKS QP YHRSQ S SSS VLI NKSMDS INYPS DVGKQQLLS I»HRS S 
RCESHQDLLPDIADSHQ<X?TEKLSDLTLQDSQKVVVVNRNLPLN 
AQIATQNYFSNFKETDGDEDDYVEIKSEEDESELELSHNRRRKS 
DSKFVDADFSDNVCSGNTLHSLNSPRTPKKPVNSKLGLSPYLTP 
YNDSDKLNDYLWRGPS PNQQNIVQSLREKFQCLSSSS FA 


7131 


805 


573 


AAAEGHIE WKFIi IEACKVNP FAKDR WGNI PLDDAVQFNHLE W 
KLLQDYQDS YTLSETQAEAAAEALSKENLESMV 


7132


1420 


1087 


IDMIiI,LSGALVSGPYTLlTTAVSADLGTHKSLKGNAHALSTVTA " 
1 1 DGTGS VGAALGPLLAGLLS PSGWSNVFYMLMFADACALLFLI 
RLIHKELSCPGSATGDQVPFKEQ 


7133 


2 


3648 


QQIPGLLPAHGESGDAbRKPRLQKPITGHLDDLFFTLYPSLEKF 
EEELLELHVQDHFQEG CG PLDGGALE I LERRIiRVGVHNGLGFVQ 
RPQWVLVPEMDVAI.TRSASFSRKWSSSKTSSGSQALVLRSRL 
RLPEMVGHPAFAVI FQLEYVFS3PAGVDGNAAS VTSLSNLACMH 
MVRWAVWNPLLEADSGRVTLPLQGGIQPNPSHCLVYKVPSASMS 
SEEVKQVESGTLRFQFSLGSEEHLDAPTEPVSGPKVERRPSRKP 
PTSPSSPPAPVPRVLAAPQNSPVGPGLSISQLAASPRSPTQHCL 
AR PTSQLPHGSQAS PAQAQEFPLEAGI SHLEADLSQTSLVLE TS 
IAEQLQELPFTPLHAPIWGTQTRSSAGQPSRASMVLLQSSGFP 
BILDANKQPAEAVSATEPVTFNPQKEESDCLQSNEMVLQFLAFS 
R VAQDCRGTS WPKTVYPTFQF YR FPP ATTPRLQLVQLDE AGQP S 
SGALTHILVPVSRDGTFDAGSPGFQLRYMVGPGFIiKPGERRCFA 
RYLAVQTLQI DVWDGDSLLLIGSAAVQMKHLLRQGRPAVQASHE 








LE WATE YEQ DNMW SGDMIX3FGRVKP IGVHS VVKGRLHLTLAK 
VGHPCEQKVRGCSTLPPSRSRVISNDGASRFSGGSLLTTGSSRR 
KHWQAQKLADVDSE LAAMLLTHARQGKGPQDVSRBSDATRRRK 
LERMRSVRLQEAGGDLGRRGTSVLAQQSVRTQHLRDLQVIAAyR 
ERTKAESI ASLLSLAI TTEHTLHATLG VAEFFEFVLKNPHNTQH 
TVTVEIDNPELSTIVDSQEWRDPKGAAGLHTPVEBDMFHLRGSL 
APQLYLRPHETAHVPFKFQSFSAGQLAMVQASPGIiSNEKGMDAV 
S PWKSS AVPTKHAKVLFRAS GGKP I AVLCLT VE LQPHWDQ VFR 
FYHPELSFLKKAIRLPPWHTFPGAPVGMLGEDPPVKVRCSDPNV 
ICBTQNVGPGBPRDIFLKVASGPSPEIKDFFVIIYSDRWIATPT 
QTWQVYLHSLQRVDVSCVAGQLTRLSLVLRGTQTVRKVRAFTSH 
PQELICTDPKGVFVLPPRGVQDLHVGVRPLRAGSRFVHLNLVDVD 
CHQLVAS WLVCLCCRQPL I S KAF E I MLAAGEG KGVNKRI T YTNP 
YPSRRTFHLHSDHPELLR FREDS FQVGGGETYTIGLQFAPSQRV 
GBEEIIilYINDHEDKNBEAFCVKVIYQ 


7134 


2115 


1111 


GGEGFSYPPHVGLSLGTPLDPHYVLXiEVHYD^PTYEEGLIDNSG 
LRLFYTMDIRKYDAGVIEAGLWVSLFHTIPPGMPEFQSEGHCTI* 
SCLEEALEAEKPSGIHVFAVLLHAHI^GRGIRLRHFRKGKEMKL 
LAYDDDFDFNFQEFQYLKEEQTILPGDNLITBCRYNTKDRAEMT 
KGGLSTRSEMCLSYLLYYPRINLTRCA3IPDIMEQLQFIGVKEI 
YRPVTTWPFI IKCPKQYKNLSFMDAMKKFKWTKKEGLSFNKLVL 
SLPVNVRC3KTDNAEWS IQGMTALPPDI ERPYKAEPLVCGTSSS 
SSLHRDFS IKLLVCLLLLS CT3USTKS L 


7135 


2 


2072 


FVPRVTPRSLSLQGPKGESVGSITQPLPSSyjLiFRAASESDGfeC 
WIiDALELALRCSSLLRLGTCKPGRDGEPGTSPDASPSSLCGLPA 
SATVHPDQDLFPLNGSSLENDAFSDKSERENPEESDTETQDHSR 
KTESGSDQSETPGAPVRRGTTYVEQVQEELGELGEASQVEWSE 
ENKSLMWTLLKQLRPGMDLSR\A^PTFVLEPRSFLNKLSDYYYH 
ADLLSRAAVEEDAYSRMKLVLRWYLSGFYKKPKGIKKPYNPILG 
ETFRCCWFHPQTDSRTFYIAEQVSHHPPVSAFHVSNRKDGFCIS 
GSITAKSRFYGNSLSALLDGKATLTFLNRAEDYTLTMPYAHCKG 
I IjYGTMTIjELGG KVTI ECAKNN FQAQLEFKLKP FFGGSTS XNQ I 
SGKITSGEEVLASLSGHWDRDVFIKEEGSGSSALFWTPSGEVRR 
QRLRQHTVPLEEQTELESERLWQHVTRAISKGDQHRATQEKFAIi 
EEAQRQRARERQESLMPWKPQLFHLDP ITQBWHYRYEDHSPWDP 
LKDIAQFEQDGILRTLQQEAVARQTTFLGSPGPRHERSGPDQRL 
RKASDQPSGH3QATESSGSTPESCPELSDEBQDGDFVPGGESPC 
PRCRKEARRLQAIiHEAILSIREAQQBLHRHLSAMLSSTARAAQA 
PTPGLLQSpRSWFLLCVFIiACQLFINHILK 


7136 


2 


418 


dfvpsfrrpsgotsqtvwlijiaatlekevagi^kihhlddmQc 

SQQRKVRQMIEQLQNS KAV I QS KDATIQELKEKIAYLEAEjNLEM 

hdrmehliekqishgnfstqaraktenpgsiriskppspkpmpv 

IRWET 


7137 
7138


«6 


466 


wasgmstvpggsrhslgiqwggwgvtggeeesltvpvadtwqa 
gsfkvatqernpqraqmrlrrqkkgvvpflgdfltelqrldsai 

PDDlJ3GNTNKRSKEVRVLOEMOLLOV74AKniYRr.RPT PvcnrrvpT 

rmeqlsdkesyklscqlepemp 




2 


Ate 


wasgmstvpggsrhslgiqvrggwgvtggeeesltvpvadtwqa 
gsfkvatqernpqraqmrlrrqkkgwpflgdfltelqrldsai 
pddldgntnkrskevrvlqemqllqvaamnyrlrplekfvtyft 
rmeqlsdkesyklscqlepenp 


7139 


1 


357 


SLRNSARGLRMAASAARGAAAIiRRSINQPVAFVRRIpW'^AASSQ " 
LKEHFAQFGKVRR CILPFDKETGFHRGLG WVQFSSBEGLRNALQ 
QENHI IDGVKVQVHTRRPKLPQTSDDEKKDF 


7140 


1401 


195? 


RASSI^VLKAWGGLIFSSFQQQHTGQYALEELFDLKVYDCFCSF 
NMNVSLEKQLRPSQPWPRGKCRKTPGWEBARPKRQDLRGDLGKT 
QAGPAEAHTRGPPRLPAATGCPPHLPGLLSGISVDIDPI'GLQSQ 

WTPKGQDPPLMFSEDYQKSLLBQYHLGIJ)QKt»RKYVVGEl*IWNP 
ADPMTNQCG 


7141 


124 


1073 


bDSRSCWLDMBDL BED VRFI VDBTLDFGGLS PS DS RE EED I TVL 
VTPEKPLRRGLSHRSDPNAVAPAPQGVRLSLGPLSPEKLEBItD 
BANRIiAAQLEQCALQDRESAGBGLGPRRVKPS PRRETFVLKDS P 
VRDLLPTVNSLTRSTPS/LKQPDASTPE* * *EGVSQGSPGYI WK 
EALQHEEG VTHLQSVPCIQKPS I FSS\SRSTPPVRGRAGPSGRA 
AASEETRAAKLRGAAAKSSCQLPIPSAI PRPASRMPLTSRSVPP 
GRGALP PDS LSTRKGI* PR PSTAGHRVRE SGHKVP VSQRLNL PVM 
GATRSNLQPP 


7142 


658 


833 


li I FLMLHMELKMLS S VTLHIRAFLYW I CLKPTSCLl FQNVLNLL 
KK * S RAVG WWM CRT/ YS SDLQ VGVI K PWLLLGS QDAAHDLDT 
LKKNKVTH I LNVAYG VENAFLSDFT YKS ISILDLPETNILSYFP 
ECFEFIEEAKRKDGWLVHCNA 


7143 


3 


773 


SLEMSSDGEPIjSRMDSEDSISSTIMDVDSTISSGRSTPAMMNGQ 
GS T TSSS KN I AYNCCWDQCQACFNS S PDLADHI RS IHVDGQRGG 
VFVCLWKGCKVYNTPSTSQSWLQRHMLTHSGDKPFKCWGGCNA 
SFASQGGUUIHVPTHFSQQNSSKVSSQPKAKBESPSKAGMNKRR 
KLKNKRRRSLARPHDFFDAQTLDAIRHRAI CFNLSAHI ESLGKG 
HSWFHSTVSILLFF0IKYKTT.OKNT«:tttciv<3T vt 


7144 


1 


988 


frvnmqdggpspaehskaeesagmearflglpdaagssg'ptpar 

RCPAPRPAGVSYVIRDEVEKYIORNGVNALQLDPALWRLFTAGRD 
SIIRIWSVNQHKQDPY17VSMEHHTDWVNDIVLCCNGKTLISASS 
DTTVKVWNAHKGFCMSTLRTHKDYVKAIiAYAXDKELVASAGLDR 
QI FLWDVNTLTALTAS NNTVTTSSLSGNKDS I YS LAMNQLGT 1 1 
VSGSTE KVLRVWDPRTCAKLMKLKGHTDNVKALLLNRDGTQOtjS 
GS S DGT IRLWSLGQQR C IATYRVHDEGVWALQ VNDAFTHVYSGG 
RDRKIYCTDLRNPDIRVLICB 
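
The key in the table heading assigns a meaning to each character that may appear in an amino acid segment. As a minimal sketch, assuming the symbols given in that key (`*` for a stop codon, `/` for a possible nucleotide deletion, `\` for a possible nucleotide insertion), a segment could be tallied as follows. The names `RESIDUES`, `MARKERS` and `summarize_segment` are hypothetical and are not part of the specification. Ignoring whitespace and accepting either letter case are assumptions of this sketch.

```python
# One-letter residue codes exactly as listed in the table key.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Non-residue symbols from the same key.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment: str) -> dict:
    """Count residues and list the positions of markers and unknown characters.

    Positions are 0-based indices into the segment as given.
    """
    residues = 0
    markers = []
    unrecognized = []
    for index, char in enumerate(segment):
        if char.isspace():
            continue  # assumption: whitespace carries no meaning
        if char.upper() in RESIDUES:
            residues += 1
        elif char in MARKERS:
            markers.append((index, MARKERS[char]))
        else:
            unrecognized.append((index, char))
    return {"residues": residues, "markers": markers, "unrecognized": unrecognized}
```

Characters outside the key are reported rather than silently dropped, so that a segment containing them can be checked against the original sequence listing.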



WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the
group consisting of SEQ ID NO:1-1786 and 3573-5358, a mature protein coding portion
of SEQ ID NO:1-1786 and 3573-5358, an active domain of SEQ ID NO:1-1786 and
3573-5358, and complementary sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1.

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group 
consisting of: 
(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent
conditions with any one of SEQ ID NO:1-1786 and 3573-5358.

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; 
and 

b) detecting the complex, so that if a complex is detected, the
polynucleotide of claim 1 is detected.

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample under stringent hybridization conditions 
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such 
conditions; 

b) amplifying a product comprising at least a portion of the 
polynucleotide of claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in
the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation 
is detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a 
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified.

19. A method of producing the polypeptide of claim 10, comprising,

a) culturing a host cell comprising a polynucleotide sequence selected
from the group consisting of a polynucleotide sequence of SEQ ID NO:1-1786 and
3573-5358, a mature protein coding portion of SEQ ID NO:1-1786 and 3573-5358, an active
domain of SEQ ID NO:1-1786 and 3573-5358, complementary sequences thereof and a
polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO:1-1786
and 3573-5358, under conditions sufficient to express the polypeptide in said cell; and

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the
group consisting of any one of the polypeptides SEQ ID NO:1787-3572 and 5359-7144,
the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide
array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence
information of at least one of SEQ ID NO:1-1786 and 3573-5358.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a
computer-readable format.

27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 
and a pharmaceutically acceptable carrier.

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier.
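
Claim 3 recites "greater than about 90% sequence identity" without stating, in the claims themselves, how identity is calculated. As a minimal sketch only, assuming two sequences that have already been aligned to equal length with `-` marking gaps, percent identity could be computed as shown below. The function names, the gap character, and the choice of denominator (all aligned columns except those where both sequences have a gap) are assumptions of this sketch, not definitions taken from the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same non-gap character.

    Assumes both inputs come from one pairwise alignment (equal length,
    '-' for gaps). Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def exceeds_identity(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True when identity is strictly greater than the threshold (assumed 90%)."""
    return percent_identity(aligned_a, aligned_b) > threshold
```

Different alignment programs and different denominators (for example, the length of the shorter sequence) can give different percentages for the same pair, so any comparison against a threshold depends on those choices.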